Abstract

In this paper, we study the degraded compound multi-receiver wiretap channel. 
The degraded compound multi-receiver wiretap channel consists of two groups of users 
and a group of eavesdroppers, where, if we pick an arbitrary user from each group of 
users and an arbitrary eavesdropper, they satisfy a certain Markov chain. We study 
two different communication scenarios for this channel. In the first scenario, the transmitter
wants to send a confidential message to users in the first (stronger) group and
a different confidential message to users in the second (weaker) group, where both
messages need to be kept confidential from the eavesdroppers. For this scenario, we
assume that there is only one eavesdropper. We obtain the secrecy capacity region 
for the general discrete memoryless channel model, the parallel channel model, and 
the Gaussian parallel channel model. For the Gaussian multiple-input multiple-output 
(MIMO) channel model, we obtain the secrecy capacity region when there is only one
user in the second group. In the second scenario we study, the transmitter sends a
confidential message to users in the first group which needs to be kept confidential 
from the second group of users and the eavesdroppers. Furthermore, the transmitter 
sends a different confidential message to users in the second group which needs to be 
kept confidential only from the eavesdroppers. For this scenario, we do not put any 
restriction on the number of eavesdroppers. As in the first scenario, we obtain the 
secrecy capacity region for the general discrete memoryless channel model, the parallel 
channel model, and the Gaussian parallel channel model. For the Gaussian MIMO 
channel model, we establish the secrecy capacity region when there is only one user in 
the second group.

1 Introduction



Information theoretic secrecy was initiated by Wyner in his seminal work [1], where he
considered the degraded wiretap channel and established the capacity-equivocation rate region
of this degraded channel model. Later, Csiszár and Körner generalized his result to arbitrary,
not necessarily degraded, wiretap channels in [2]. In recent years, multi-user versions of the
wiretap channel have attracted a considerable amount of research interest; see for example 
references [3-21] in [3]. Among all these extensions, two natural extensions of the wiretap 
channel to the multi-user setting are particularly of interest here: secure broadcasting and 
compound wiretap channels. 

Secure broadcasting refers to the situation where a transmitter wants to communicate 
with several legitimate receivers confidentially in the presence of an external eavesdropper. 
We call this channel model the multi-receiver wiretap channel. Since the underlying channel 
model without an eavesdropper is the broadcast channel, which is not understood to the full 
extent even for the two-user case, most works on secure broadcasting have focused on some 
special classes of multi-receiver wiretap channels, where these classes are identified by certain 
degradation orders [4-8]. In particular, [5-7] consider the degraded multi-receiver wiretap 
channel, where observations of all users and the eavesdropper satisfy a certain Markov chain. 
In [5], the secrecy capacity region is derived for the two-user case, and in [6,7], the secrecy 
capacity region is established for an arbitrary number of legitimate users. The importance of 
this result lies in the facts that the Gaussian multi-receiver wiretap channel belongs to this 
class, and the secrecy capacity region of the degraded multi-receiver wiretap channel serves 
as a crucial step in establishing the secrecy capacity region of the Gaussian multiple-input 
multiple-output (MIMO) multi-receiver wiretap channel [3], though the latter channel is not 
necessarily degraded. In [3], besides proving the secrecy capacity region of the Gaussian 
MIMO multi-receiver wiretap channel, we also present new optimization results regarding 
extremal properties of Gaussian random vectors, which we generalize here. 

Another extension of the wiretap channel that we are particularly interested in here is
the compound wiretap channel. In compound wiretap channels, there are a finite number of 
channel states determining the channel transition probability. The channel takes a certain 
fixed state for the entire duration of the transmission, and the transmitter does not have 
any knowledge about the channel state realization. Thus, the aim of the transmitter is to 
ensure the secrecy of messages irrespective of the channel state realization. In addition to 
this definition, the compound wiretap channel admits another interpretation. Consider the 
multi-receiver wiretap channel with several legitimate users and many eavesdroppers, where 
the transmitter wants to transmit a common confidential message to legitimate users while 
keeping all of the eavesdroppers totally ignorant of the message. Since each eavesdropper and 
legitimate user pair can be regarded as a different channel state realization, this channel is 
equivalent to a compound wiretap channel. Therefore, one can interpret a compound wiretap 
channel as multicasting a common confidential message to several legitimate receivers in the
presence of one or more eavesdroppers [9]. In this work, we mostly refer to this interpretation,
which is also the reason why we classify the compound wiretap channel as an extension of 
the wiretap channel to a multi-user setting. 

Keeping this interpretation in mind, the first works on the compound wiretap channel
are due to Yamamoto [10, 11]. References [10, 11] consider the parallel wiretap channel
with two sub-channels where each sub-channel is wiretapped by a different eavesdropper. 
References [10,11] establish capacity-equivocation rate regions for the situation where in each
sub-channel, the legitimate receiver is less noisy with respect to the eavesdropper of this
sub-channel. Other works which implicitly study the compound wiretap channel are [4,6-8,12],
where [4,6,7] consider the transmission of a common confidential message to many legitimate
receivers in the presence of a single eavesdropper, [8] focuses on two scenarios, one with two
legitimate receivers and one eavesdropper and the other with one legitimate receiver and two
eavesdroppers, and [12] studies the fading wiretap channel with many receivers. Reference [9]
considers the general discrete
compound wiretap channel and provides inner and outer bounds for the secrecy capacity. 
In addition to these inner and outer bounds, [9] also establishes the secrecy capacity of 
the degraded compound wiretap channel as well as its degraded Gaussian MIMO instance. 
Another work on the compound wiretap channel is [13] where the secrecy capacity of a class 
of non-degraded Gaussian parallel compound wiretap channels is established. 

In this work, we consider compound broadcast channels from a secrecy point of view, 
which enables us to study the secure broadcasting problem over compound channels. We note 
that the current literature regarding the compound wiretap channel considers the transmission
of only one confidential message, whereas here, we study the transmission of multiple
confidential messages, where each of these messages needs to be delivered to a different
group of users in perfect secrecy. Hereafter, we call this channel model the compound
multi-receiver wiretap channel to emphasize the presence of more than one confidential message.
The compound multi-receiver wiretap channel we study here consists of two groups of users 
and a group of eavesdroppers, as shown in Figure 1. We focus on a special class of compound
multi-receiver wiretap channels which exhibits a certain degradation order. If we
consider an arbitrary user from each group and an arbitrary eavesdropper, they satisfy a 
certain Markov chain. In particular, we assume that there exist two fictitious users. The 
first fictitious user is degraded with respect to any user from the first group, and any user 
from the second group is degraded with respect to the first fictitious user. There exists a 
similar degradedness structure for the second fictitious user in the sense that it is degraded 
with respect to any user from the second group, and any eavesdropper is degraded with respect
to it. Without eavesdroppers, this channel model reduces to the degraded compound
broadcast channel studied in [14]. Adopting their terminology, we call our channel model the
degraded compound multi-receiver wiretap channel. Here, we consider the general discrete
memoryless version of the degraded compound multi-receiver wiretap channel as well as its
specializations to the parallel degraded compound multi-receiver wiretap channel, the Gaussian
parallel degraded compound multi-receiver wiretap channel, and the Gaussian MIMO
degraded compound multi-receiver wiretap channel. We study two different communication
scenarios for each version of the degraded compound multi-receiver wiretap channel model.

Figure 1: The degraded compound multi-receiver wiretap channel.

In the first scenario, which is illustrated in Figure 2, the transmitter wants to send a
confidential message to users in the first group, and a different confidential message to users
in the second group, where both messages need to be kept confidential from the eavesdroppers.
For this scenario, we assume that there exists only one eavesdropper and obtain the
secrecy capacity region in a single-letter form. While obtaining this result, the presence of
the fictitious user between the two groups of users plays a crucial role in the converse proof 
by providing a conditional independence structure in the channel, which enables us to define 
an auxiliary random variable that yields a tight outer bound. After establishing single-letter 
expressions for the secrecy capacity region, we consider the parallel degraded compound 
multi-receiver wiretap channel. For the parallel degraded compound multi-receiver wiretap 
channel, we obtain the secrecy capacity region in a single-letter form as well. Though the 
general discrete memoryless degraded compound multi-receiver wiretap channel encompasses 
the parallel degraded compound multi-receiver wiretap channel as a special case, we still need 
a converse proof to establish the optimality of independent signalling in each sub-channel. 
After we obtain the secrecy capacity region of the parallel degraded compound multi-receiver 
wiretap channel, we consider the Gaussian parallel degraded compound multi-receiver wiretap
channel. In particular, we evaluate the secrecy capacity region of the parallel degraded
compound multi-receiver wiretap channel for the Gaussian case, which is tantamount to finding
the optimal joint distribution of auxiliary random variables and channel inputs, which is
shown to be Gaussian. We accomplish this by using Costa's entropy power inequality [15].
Finally, we consider the Gaussian MIMO degraded compound multi-receiver wiretap channel,
and evaluate its secrecy capacity region when there is only one user in the second group.
We show the optimality of a jointly Gaussian distribution for auxiliary random variables and 
channel inputs by generalizing our optimization results in [3]. 

Figure 2: The first scenario for the degraded compound multi-receiver wiretap channel.

Figure 3: The second scenario for the degraded compound multi-receiver wiretap channel.

In the second scenario we study here, which is illustrated in Figure 3, the transmitter
wants to send a confidential message to users in the first group which needs to be kept
confidential from users in the second group and eavesdroppers. Moreover, the transmitter
sends a different confidential message to users in the second group, which needs to be kept
confidential from the eavesdroppers. If there were only one user in each group and one
eavesdropper, this channel model would reduce to the channel model that was studied in [16]. 
However, here, there are an arbitrary number of users in each group and an arbitrary number 
of eavesdroppers. Hence, our model can be viewed as a generalization of [16] to a compound 
setting. Adopting their terminology, we call this channel model the degraded compound
multi-receiver wiretap channel with layered messages. We first obtain the secrecy capacity region
in a single-letter form for a general discrete memoryless setting, where again the presence 
of fictitious users plays a key role in the converse proof. Next, we consider the parallel 
degraded compound multi-receiver wiretap channel with layered messages and establish its 
secrecy capacity region in a single-letter form. In this case as well, we provide the converse 
proof which is again necessary to show the optimality of independent signalling in each 
sub-channel. After we obtain the secrecy capacity region of the parallel degraded compound 
multi-receiver wiretap channel with layered messages, we evaluate it for the Gaussian parallel 
degraded compound multi-receiver wiretap channel with layered messages by showing the 
optimality of a jointly Gaussian distribution for auxiliary random variables and channel 
inputs. For that purpose, we again use Costa's entropy power inequality [15]. Finally, 
we consider the Gaussian MIMO degraded compound multi-receiver wiretap channel with 
layered messages, and evaluate its secrecy capacity region when there is only one user in the 
second group. To this end, we show that jointly Gaussian auxiliary random variables and 
channel inputs are optimal by extending our optimization results in [3].

2 System Model

In this paper, we consider the degraded compound multi-receiver wiretap channel, see Figure 1,
which consists of two groups of users and a group of eavesdroppers. There are $K_1$
users in the first group, $K_2$ users in the second group, and $K_Z$ eavesdroppers. The channel
is assumed to be memoryless with a transition probability

$$p(y_1^1, \ldots, y_{K_1}^1, y_1^2, \ldots, y_{K_2}^2, z_1, \ldots, z_{K_Z} \mid x) \tag{1}$$

where $X \in \mathcal{X}$ is the channel input, $Y_j^1 \in \mathcal{Y}_j^1$ is the channel output of the $j$th user in the first
group, $j = 1, \ldots, K_1$, $Y_k^2 \in \mathcal{Y}_k^2$ is the channel output of the $k$th user in the second group,
$k = 1, \ldots, K_2$, and $Z_t \in \mathcal{Z}_t$ is the channel output of the $t$th eavesdropper, $t = 1, \ldots, K_Z$.

We assume that there exist two fictitious users with observations $Y^* \in \mathcal{Y}^*$, $Z^* \in \mathcal{Z}^*$ such
that they satisfy the Markov chain

$$X \to Y_j^1 \to Y^* \to Y_k^2 \to Z^* \to Z_t, \quad \forall (j,k,t) \tag{2}$$

This Markov chain is the reason why we call this channel model the degraded compound
multi-receiver wiretap channel. Actually, there is a slight inexactness in the terminology
here because the Markov chain in (2) is more restrictive than the Markov chain

$$X \to Y_j^1 \to Y_k^2 \to Z_t, \quad \forall (j,k,t) \tag{3}$$

and it might be more natural to define the degradedness of the compound multi-receiver
wiretap channel by the Markov chain in (3). However, in this work, we adopt the terminology
of the previous work on compound broadcast channels [14], and call the channel satisfying
(2) the degraded compound multi-receiver wiretap channel. Finally, we note that when there
are no eavesdroppers, this channel reduces to the degraded compound broadcast channel that 
was studied in [14]. 

2.1 Parallel Degraded Compound Multi-receiver Wiretap Channels

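The parallel degraded compound multi-receiver wiretap channel, where each user's and each
eavesdropper's channel consists of $L$ independent sub-channels, i.e.,

$$Y_j^1 = (Y_{j1}^1, \ldots, Y_{jL}^1), \quad j = 1, \ldots, K_1 \tag{4}$$
$$Y_k^2 = (Y_{k1}^2, \ldots, Y_{kL}^2), \quad k = 1, \ldots, K_2 \tag{5}$$
$$Z_t = (Z_{t1}, \ldots, Z_{tL}), \quad t = 1, \ldots, K_Z \tag{6}$$

has the following overall transition probability

$$p(y_1^1, \ldots, y_{K_1}^1, y_1^2, \ldots, y_{K_2}^2, z_1, \ldots, z_{K_Z} \mid x) = \prod_{\ell=1}^{L} p(y_{1\ell}^1, \ldots, y_{K_1 \ell}^1, y_{1\ell}^2, \ldots, y_{K_2 \ell}^2, z_{1\ell}, \ldots, z_{K_Z \ell} \mid x_\ell) \tag{7}$$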



where $X_\ell$, $\ell = 1, \ldots, L$, is the $\ell$th sub-channel's input. We define the degradedness of the
parallel compound multi-receiver wiretap channel in a similar fashion. In particular, we call
a parallel compound multi-receiver wiretap channel degraded, if there exist two sequences of
random variables

$$Y^* = (Y_1^*, \ldots, Y_L^*) \tag{8}$$
$$Z^* = (Z_1^*, \ldots, Z_L^*) \tag{9}$$

which satisfy Markov chains

$$X_\ell \to Y_{j\ell}^1 \to Y_\ell^* \to Y_{k\ell}^2 \to Z_\ell^* \to Z_{t\ell}, \quad \forall (j,k,t,\ell) \tag{10}$$

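2.2 Gaussian Parallel Degraded Compound Multi-receiver Wiretap Channels

The Gaussian parallel compound multi-receiver wiretap channel is defined by

$$\mathbf{Y}_j^1 = \mathbf{X} + \mathbf{N}_j^1, \quad j = 1, \ldots, K_1 \tag{11}$$
$$\mathbf{Y}_k^2 = \mathbf{X} + \mathbf{N}_k^2, \quad k = 1, \ldots, K_2 \tag{12}$$
$$\mathbf{Z}_t = \mathbf{X} + \mathbf{N}_t^Z, \quad t = 1, \ldots, K_Z \tag{13}$$

where all column vectors $\{\mathbf{Y}_j^1\}_{j=1}^{K_1}$, $\{\mathbf{Y}_k^2\}_{k=1}^{K_2}$, $\{\mathbf{Z}_t\}_{t=1}^{K_Z}$, $\mathbf{X}$, $\{\mathbf{N}_j^1\}_{j=1}^{K_1}$, $\{\mathbf{N}_k^2\}_{k=1}^{K_2}$, $\{\mathbf{N}_t^Z\}_{t=1}^{K_Z}$ are of
dimensions $L \times 1$. $\{\mathbf{N}_j^1\}_{j=1}^{K_1}$, $\{\mathbf{N}_k^2\}_{k=1}^{K_2}$, $\{\mathbf{N}_t^Z\}_{t=1}^{K_Z}$ are Gaussian random vectors with diagonal
covariance matrices $\{\boldsymbol{\Lambda}_j^1\}_{j=1}^{K_1}$, $\{\boldsymbol{\Lambda}_k^2\}_{k=1}^{K_2}$, $\{\boldsymbol{\Lambda}_t^Z\}_{t=1}^{K_Z}$, respectively. The channel input $\mathbf{X}$ is subject
to a trace constraint as

$$E[\mathbf{X}^\top \mathbf{X}] = \mathrm{tr}\left(E[\mathbf{X}\mathbf{X}^\top]\right) \leq P \tag{14}$$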

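In this paper, we will be interested in Gaussian parallel degraded compound multi-receiver
wiretap channels which means that the covariance matrices satisfy the following order

$$\boldsymbol{\Lambda}_j^1 \preceq \boldsymbol{\Lambda}_k^2 \preceq \boldsymbol{\Lambda}_t^Z, \quad \forall (j,k,t) \tag{15}$$

Since noise covariance matrices are diagonal, the order in (15) implies

$$\Lambda_{j,\ell\ell}^1 \leq \Lambda_{k,\ell\ell}^2 \leq \Lambda_{t,\ell\ell}^Z, \quad \forall (j,k,t,\ell) \tag{16}$$

where $\Lambda_{j,\ell\ell}^1$, $\Lambda_{k,\ell\ell}^2$, $\Lambda_{t,\ell\ell}^Z$ denote the $\ell$th diagonal element of $\boldsymbol{\Lambda}_j^1$, $\boldsymbol{\Lambda}_k^2$, $\boldsymbol{\Lambda}_t^Z$, respectively.

The diagonality of noise covariance matrices also ensures the existence of diagonal matrices
$\boldsymbol{\Lambda}_Y^*$ and $\boldsymbol{\Lambda}_Z^*$ such that

$$\boldsymbol{\Lambda}_j^1 \preceq \boldsymbol{\Lambda}_Y^* \preceq \boldsymbol{\Lambda}_k^2 \preceq \boldsymbol{\Lambda}_Z^* \preceq \boldsymbol{\Lambda}_t^Z, \quad \forall (j,k,t) \tag{17}$$

For example, we can select $\boldsymbol{\Lambda}_Y^*$ as $\Lambda_{Y,\ell\ell}^* = \max_{j=1,\ldots,K_1} \Lambda_{j,\ell\ell}^1$, which already satisfies (17)
because of $\max_{j=1,\ldots,K_1} \Lambda_{j,\ell\ell}^1 \leq \min_{k=1,\ldots,K_2} \Lambda_{k,\ell\ell}^2$, which is due to (16). Similarly, we can select
$\boldsymbol{\Lambda}_Z^*$. Thus, for Gaussian parallel compound multi-receiver channels, the two possible ways
of defining degradedness, i.e., (2) and (3), are equivalent due to the equivalence of (15) and
(17).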
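The selection of the fictitious diagonal covariances described above can be checked numerically. The following is a minimal sketch in plain Python, assuming each diagonal covariance matrix is stored as the list of its $L$ diagonal entries. The function names and the choice of the second fictitious matrix as the entrywise maximum over the second group (the text only says it is selected "similarly") are illustrative assumptions, not definitions from the paper.

```python
def fictitious_diagonals(lam1, lam2, lamz):
    """Return the diagonals of two fictitious noise covariances.

    lam1, lam2, lamz: lists of diagonals (one list of length L per user or
    eavesdropper) for the first group, the second group, and the
    eavesdroppers. This data layout is an assumption of the sketch.
    """
    num_sub = len(lam1[0])
    lam_y, lam_z = [], []
    for ell in range(num_sub):
        max1 = max(row[ell] for row in lam1)
        min2 = min(row[ell] for row in lam2)
        max2 = max(row[ell] for row in lam2)
        minz = min(row[ell] for row in lamz)
        # Entrywise order that degradedness imposes on every sub-channel.
        if not (max1 <= min2 and max2 <= minz):
            raise ValueError("degradedness order violated on sub-channel %d" % ell)
        lam_y.append(max1)  # largest first-group variance on this sub-channel
        lam_z.append(max2)  # assumed analogous choice for the second fictitious user
    return lam_y, lam_z


def satisfies_chain(lam1, lam2, lamz, lam_y, lam_z):
    """Check lam1 <= lam_y <= lam2 <= lam_z <= lamz entrywise for all users."""
    for ell in range(len(lam_y)):
        ok = (all(r[ell] <= lam_y[ell] for r in lam1)
              and all(lam_y[ell] <= r[ell] for r in lam2)
              and all(r[ell] <= lam_z[ell] for r in lam2)
              and all(lam_z[ell] <= r[ell] for r in lamz))
        if not ok:
            return False
    return True


# Hypothetical example: two users per group, one eavesdropper, L = 2.
lam1 = [[1.0, 2.0], [1.5, 1.0]]
lam2 = [[2.0, 3.0], [4.0, 2.5]]
lamz = [[5.0, 6.0]]
lam_y, lam_z = fictitious_diagonals(lam1, lam2, lamz)
print(lam_y, lam_z, satisfies_chain(lam1, lam2, lamz, lam_y, lam_z))
```

Because the matrices are diagonal, the matrix order reduces to a comparison of scalars on each sub-channel, which is why no eigenvalue computation is needed here.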

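2.3 Gaussian MIMO Degraded Compound Multi-receiver Wiretap Channels

The Gaussian MIMO degraded compound multi-receiver wiretap channel is defined by

$$\mathbf{Y}_j^1 = \mathbf{X} + \mathbf{N}_j^1, \quad j = 1, \ldots, K_1 \tag{18}$$
$$\mathbf{Y}_k^2 = \mathbf{X} + \mathbf{N}_k^2, \quad k = 1, \ldots, K_2 \tag{19}$$
$$\mathbf{Z}_t = \mathbf{X} + \mathbf{N}_t^Z, \quad t = 1, \ldots, K_Z \tag{20}$$

where all column vectors $\{\mathbf{Y}_j^1\}_{j=1}^{K_1}$, $\{\mathbf{Y}_k^2\}_{k=1}^{K_2}$, $\{\mathbf{Z}_t\}_{t=1}^{K_Z}$, $\mathbf{X}$, $\{\mathbf{N}_j^1\}_{j=1}^{K_1}$, $\{\mathbf{N}_k^2\}_{k=1}^{K_2}$, $\{\mathbf{N}_t^Z\}_{t=1}^{K_Z}$ are of
dimensions $M \times 1$. $\{\mathbf{N}_j^1\}_{j=1}^{K_1}$, $\{\mathbf{N}_k^2\}_{k=1}^{K_2}$, $\{\mathbf{N}_t^Z\}_{t=1}^{K_Z}$ are Gaussian random vectors with covariance
matrices $\{\boldsymbol{\Sigma}_j^1\}_{j=1}^{K_1}$, $\{\boldsymbol{\Sigma}_k^2\}_{k=1}^{K_2}$, $\{\boldsymbol{\Sigma}_t^Z\}_{t=1}^{K_Z}$, respectively. Unlike in the case of Gaussian parallel
channels, these covariance matrices are not necessarily diagonal. The channel input $\mathbf{X}$ is
subject to a covariance constraint

$$E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S} \tag{21}$$

where $\mathbf{S} \succ \mathbf{0}$.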

In this paper, we study Gaussian MIMO degraded compound multi-receiver wiretap channels
for which there exist covariance matrices $\boldsymbol{\Sigma}_Y^*$ and $\boldsymbol{\Sigma}_Z^*$ such that

$$\boldsymbol{\Sigma}_j^1 \preceq \boldsymbol{\Sigma}_Y^* \preceq \boldsymbol{\Sigma}_k^2 \preceq \boldsymbol{\Sigma}_Z^* \preceq \boldsymbol{\Sigma}_t^Z, \quad \forall (j,k,t) \tag{22}$$

We note that the order in (22), by which we define the degradedness, is more restrictive than
the other possible order that can be used to define the degradedness, i.e.,

$$\boldsymbol{\Sigma}_j^1 \preceq \boldsymbol{\Sigma}_k^2 \preceq \boldsymbol{\Sigma}_t^Z, \quad \forall (j,k,t) \tag{23}$$

In [14], a specific numerical example is provided to show that the order in (23) strictly
subsumes the one in (22).

2.4 Comments on Gaussian MIMO Degraded Compound Multi-receiver Wiretap Channels

We provide some comments about the way we define the Gaussian MIMO degraded compound
multi-receiver wiretap channel. The first one is about the covariance constraint in
(21). Though it is more common to define capacity regions under a total power constraint,
i.e., $\mathrm{tr}(E[\mathbf{X}\mathbf{X}^\top]) \leq P$, the covariance constraint in (21) is more general and it subsumes
the total power constraint as a special case [17]. In particular, if we denote the secrecy
capacity region under the constraint in (21) by $C(\mathbf{S})$, then the secrecy capacity region under
the trace constraint, $\mathrm{tr}(E[\mathbf{X}\mathbf{X}^\top]) \leq P$, can be written as [17]

$$C^{\mathrm{trace}}(P) = \bigcup_{\mathbf{S}:\, \mathrm{tr}(\mathbf{S}) \leq P} C(\mathbf{S}) \tag{24}$$

The second comment is about our assumption that $\mathbf{S}$ is strictly positive definite. This
assumption does not lead to any loss of generality because for any Gaussian MIMO compound
multi-receiver wiretap channel with a positive semi-definite covariance constraint, i.e., $\mathbf{S} \succeq \mathbf{0}$
and $|\mathbf{S}| = 0$, we can always construct an equivalent channel with the constraint $E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S}'$
where $\mathbf{S}' \succ \mathbf{0}$ (see Lemma 2 of [17]), which has the same secrecy capacity region.

The last comment is about the assumption that the transmitter and all receivers have
the same number of antennas. This assumption is implicit in the channel definition, see
(18)-(20), and also in the definition of degradedness, see (22). However, we can extend
the definition of the Gaussian MIMO degraded compound multi-receiver wiretap channel to
include the cases where the number of transmit antennas and the number of receive antennas
at each receiver are not necessarily the same. To this end, we first introduce the following
channel model

$$\mathbf{Y}_j^1 = \mathbf{H}_j^1 \mathbf{X} + \mathbf{N}_j^1, \quad j = 1, \ldots, K_1 \tag{25}$$
$$\mathbf{Y}_k^2 = \mathbf{H}_k^2 \mathbf{X} + \mathbf{N}_k^2, \quad k = 1, \ldots, K_2 \tag{26}$$
$$\mathbf{Z}_t = \mathbf{H}_t^Z \mathbf{X} + \mathbf{N}_t^Z, \quad t = 1, \ldots, K_Z \tag{27}$$

where $\mathbf{H}_j^1$, $\mathbf{H}_k^2$, $\mathbf{H}_t^Z$ are the channel matrices of sizes $r_j^1 \times t$, $r_k^2 \times t$, $r_t^Z \times t$, respectively, and $\mathbf{X}$
is of size $t \times 1$. The channel outputs $\mathbf{Y}_j^1$, $\mathbf{Y}_k^2$, $\mathbf{Z}_t$ are of sizes $r_j^1 \times 1$, $r_k^2 \times 1$, $r_t^Z \times 1$, respectively.
The Gaussian noise vectors $\mathbf{N}_j^1$, $\mathbf{N}_k^2$, $\mathbf{N}_t^Z$ are assumed to have identity covariance matrices.
To define degradedness for the channel model given in (25)-(27), we need the following
definition from [14]: A receive vector $\mathbf{Y}_a = \mathbf{H}_a \mathbf{X} + \mathbf{N}_a$ of size $r_a \times 1$ is said to be degraded
with respect to $\mathbf{Y}_b = \mathbf{H}_b \mathbf{X} + \mathbf{N}_b$ of size $r_b \times 1$, if there exists a matrix $\mathbf{D}$ of size $r_a \times r_b$ such
that $\mathbf{D}\mathbf{H}_b = \mathbf{H}_a$ and $\mathbf{D}\mathbf{D}^\top \preceq \mathbf{I}$. Using this definition, we now
define degradedness for the channel model in (25)-(27). To this
end, we first introduce two fictitious users with observations $\mathbf{Y}^*$ and $\mathbf{Z}^*$, which are given by

$$\mathbf{Y}^* = \mathbf{H}_Y^* \mathbf{X} + \mathbf{N}_Y^* \tag{28}$$
$$\mathbf{Z}^* = \mathbf{H}_Z^* \mathbf{X} + \mathbf{N}_Z^* \tag{29}$$

The Gaussian MIMO compound multi-receiver wiretap channel in (25)-(27) is said to be
degraded if the following two conditions hold: i) Y* is degraded with respect to any user 
from the first group, and any user from the second group is degraded with respect to Y*, 
and ii) Z* is degraded with respect to any user from the second group, and any eavesdropper 
is degraded with respect to Z*, where degradedness here is with respect to the definition 
given above. 
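The definition from [14] used above can be turned into a simple numerical check. The following plain-Python sketch verifies whether a given candidate matrix D certifies that one receive vector is degraded with respect to another, i.e., whether D H_b = H_a and I - D D^T is positive semi-definite. It does not search for D, which the definition only requires to exist. The helper names and the tolerance-based semi-definiteness test (a Cholesky factorization of the matrix shifted by a small multiple of the identity) are assumptions of this sketch.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def is_psd(a, tol=1e-9):
    """Test positive semi-definiteness up to tol via Cholesky of a + tol*I."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] + (tol if i == j else 0.0)
            s -= sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # a pivot failed, so a + tol*I is not positive definite
                low[i][i] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True


def certifies_degradedness(h_a, h_b, d, tol=1e-9):
    """Return True if d shows that Y_a is degraded with respect to Y_b.

    Conditions checked: d @ h_b equals h_a, and I - d @ d^T is PSD.
    """
    prod = matmul(d, h_b)
    same = all(abs(prod[i][j] - h_a[i][j]) <= tol
               for i in range(len(h_a)) for j in range(len(h_a[0])))
    ddt = matmul(d, transpose(d))
    gap = [[(1.0 if i == j else 0.0) - ddt[i][j] for j in range(len(ddt))]
           for i in range(len(ddt))]
    return same and is_psd(gap, tol)


# Hypothetical example: Y_a sees a scaled-down version of Y_b's channel.
h_b = [[1.0, 0.0], [0.0, 1.0]]
h_a = [[0.5, 0.0], [0.0, 0.5]]
d = [[0.5, 0.0], [0.0, 0.5]]
print(certifies_degradedness(h_a, h_b, d))
```

A return value of False only means that this particular D is not a certificate; another D might still exist.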

In the rest of the paper, we consider the channel model given in (18)-(20) instead of the
channel model given in (25)-(27), which is more general. However, if we establish the secrecy
capacity region for the Gaussian MIMO degraded compound multi-receiver wiretap channel
defined by (18)-(20), we can also obtain the secrecy capacity region for the Gaussian MIMO
degraded compound multi-receiver wiretap channel defined by (25)-(27) using the analysis
carried out in Section V of [14] and Section 7.1 of [3]. Thus, focusing on the channel model
in (18)-(20) does not result in any loss of generality.

3 Problem Statement and Main Results 

In this paper, we consider two different communication scenarios for the degraded compound 
multi-receiver wiretap channel. 

3.1 The First Scenario: External Eavesdroppers 

In the first scenario, the transmitter wants to send a confidential message to users in the 
first group and a different confidential message to users in the second group, where both 
messages need to be kept confidential from the eavesdroppers. In this case, we assume that 
there is only one eavesdropper, i.e., $K_Z = 1$. The graphical illustration of the first scenario
is given in Figure 2.

An $(n, 2^{nR_1}, 2^{nR_2})$ code for the first scenario consists of two message sets $\mathcal{W}_1 = \{1, \ldots, 2^{nR_1}\}$,
$\mathcal{W}_2 = \{1, \ldots, 2^{nR_2}\}$, an encoder $f : \mathcal{W}_1 \times \mathcal{W}_2 \to \mathcal{X}^n$, one decoder for each legitimate
user in the first group $g_j^1 : \mathcal{Y}_j^{1,n} \to \mathcal{W}_1$, $j = 1, \ldots, K_1$, and one decoder for each legitimate
user in the second group $g_k^2 : \mathcal{Y}_k^{2,n} \to \mathcal{W}_2$, $k = 1, \ldots, K_2$. The probability of error is defined
as

$$P_e^n = \max\{P_e^{1,n}, P_e^{2,n}\} \tag{30}$$

where $P_e^{1,n}$ and $P_e^{2,n}$ are given by

$$P_e^{1,n} = \max_{j \in \{1, \ldots, K_1\}} \Pr\left[g_j^1(Y_j^{1,n}) \neq W_1\right] \tag{31}$$
$$P_e^{2,n} = \max_{k \in \{1, \ldots, K_2\}} \Pr\left[g_k^2(Y_k^{2,n}) \neq W_2\right] \tag{32}$$

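A secrecy rate pair $(R_1, R_2)$ is said to be achievable if there exists an $(n, 2^{nR_1}, 2^{nR_2})$ code
which has $\lim_{n \to \infty} P_e^n = 0$ and

$$\lim_{n \to \infty} \frac{1}{n} I(W_1, W_2; Z^n) = 0 \tag{33}$$

where we dropped the subscript of $Z_t$ since $K_Z = 1$. We note that (33) implies

$$\lim_{n \to \infty} \frac{1}{n} I(W_1; Z^n) = 0 \quad \text{and} \quad \lim_{n \to \infty} \frac{1}{n} I(W_2; Z^n) = 0 \tag{34}$$

since $0 \leq I(W_i; Z^n) \leq I(W_1, W_2; Z^n)$ for $i = 1, 2$.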

From these definitions, it is clear that we are only interested in perfect secrecy rates of the 
channel. The secrecy capacity region is defined as the closure of all achievable secrecy rate 
pairs. A single-letter characterization of the secrecy capacity region is given as follows. 

Theorem 1 The secrecy capacity region of the degraded compound multi-receiver wiretap
channel is given by the union of rate pairs $(R_1, R_2)$ satisfying

$$R_1 \leq \min_{j=1,\ldots,K_1} I(X; Y_j^1 \mid U, Z) \tag{35}$$
$$R_2 \leq \min_{k=1,\ldots,K_2} I(U; Y_k^2 \mid Z) \tag{36}$$

where the union is over all $(U, X)$ such that

$$U \to X \to Y_j^1 \to Y^* \to Y_k^2 \to Z \tag{37}$$

for any $(j, k)$ pair.

Showing the achievability of this region is rather standard, and is thus omitted here. We provide
the converse proof in Appendix A. The presence of the fictitious user with observation $Y^*$
proves to be crucial in the converse proof. Essentially, it brings a conditional independence 
structure to the channel, which enables us to define the auxiliary random variable U, which, 
in turn, provides the converse proof. 
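To make the region in Theorem 1 concrete, the following sketch evaluates its two bounds for a toy binary example that is not part of the paper. Every user and the single eavesdropper are assumed to observe the input through a binary symmetric channel, with crossover probabilities ordered (first group, then second group, then eavesdropper, all at most 1/2) and with the Markov chain of the theorem assumed to hold. The auxiliary variable $U$ is taken uniform and $X = U \oplus V$ with $V \sim \mathrm{Bernoulli}(\alpha)$. Under that Markov chain the bounds split as $I(X;Y_j^1|U) - I(X;Z|U)$ and $I(U;Y_k^2) - I(U;Z)$. This particular choice of $(U, X)$ only yields points inside the region, not necessarily its boundary, and all names below are illustrative.

```python
import math


def hb(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def conv(a, b):
    """Binary convolution a * b = a(1 - b) + (1 - a)b."""
    return a * (1.0 - b) + (1.0 - a) * b


def toy_rate_pair(alpha, p1, p2, pz):
    """Evaluate the two bounds for the assumed binary symmetric example.

    alpha: crossover between U and X; p1, p2: lists of crossover
    probabilities of the first and second group; pz: eavesdropper's
    crossover probability.
    """
    # I(X;Z|U) = h(alpha * pz) - h(pz)
    leak_x = hb(conv(alpha, pz)) - hb(pz)
    # R1 bound: min over first group of I(X;Y|U) - I(X;Z|U)
    r1 = min(hb(conv(alpha, p)) - hb(p) for p in p1) - leak_x
    # R2 bound: min over second group of I(U;Y) - I(U;Z)
    r2 = min(hb(conv(alpha, pz)) - hb(conv(alpha, p)) for p in p2)
    return r1, r2


# Hypothetical crossover probabilities satisfying the assumed order.
for alpha in (0.0, 0.1, 0.5):
    print(alpha, toy_rate_pair(alpha, [0.01, 0.05], [0.1, 0.15], 0.2))
```

Sweeping $\alpha$ from 0 to 1/2 trades the rate of the second group against that of the first: at $\alpha = 0$ all of the input is determined by $U$ and the first bound is zero, while at $\alpha = 1/2$ the input is independent of $U$ and the second bound is zero.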

As a side note, if we disable the eavesdropper by setting $Z = \phi$, the region in Theorem 1
reduces to the capacity region of the underlying degraded compound broadcast channel which
was established in [14].



3.1.1 Parallel Degraded Compound Multi-Receiver Wiretap Channels

In the upcoming section, we will consider the Gaussian parallel degraded compound
multi-receiver wiretap channel. For that purpose, here, we provide the secrecy capacity region of
the parallel degraded compound multi-receiver wiretap channel in a single-letter form.

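Theorem 2 The secrecy capacity region of the parallel degraded compound multi-receiver
wiretap channel is given by the union of rate pairs $(R_1, R_2)$ satisfying

$$R_1 \leq \min_{j=1,\ldots,K_1} \sum_{\ell=1}^{L} I(X_\ell; Y_{j\ell}^1 \mid U_\ell, Z_\ell) \tag{38}$$
$$R_2 \leq \min_{k=1,\ldots,K_2} \sum_{\ell=1}^{L} I(U_\ell; Y_{k\ell}^2 \mid Z_\ell) \tag{39}$$

where the union is over all distributions of the form $\prod_{\ell=1}^{L} p(u_\ell, x_\ell)$ such that

$$U_\ell \to X_\ell \to Y_{j\ell}^1 \to Y_\ell^* \to Y_{k\ell}^2 \to Z_\ell \tag{40}$$

for any $(j, k, \ell)$ triple.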

Though Theorem 1 provides the secrecy capacity region for a rather general channel model
including the parallel degraded compound multi-receiver channel as a special case, we still
need a converse proof to show that the region in Theorem 1 reduces to the region in Theorem 2
for parallel channels. In other words, we still need to show the optimality of independent
signalling on each sub-channel. This proof is provided in Appendix B.



3.1.2 Gaussian Parallel Degraded Compound Multi-Receiver Wiretap Channels 

We now obtain the secrecy capacity region of the Gaussian parallel degraded compound
multi-receiver wiretap channel. To that end, we need to evaluate the region given in
Theorem 2, i.e., we need to find the optimal joint distribution $\prod_{\ell=1}^{L} p(u_\ell, x_\ell)$. We first introduce
the following theorem which will be instrumental in evaluating the region in Theorem 2 for
Gaussian parallel channels.

Theorem 3 Let $N_1, N^*, N_2, N_Z$ be zero-mean Gaussian random variables with variances
$\sigma_1^2, \sigma_*^2, \sigma_2^2, \sigma_Z^2$, respectively, where

$$\sigma_1^2 \leq \sigma_*^2 \leq \sigma_2^2 \leq \sigma_Z^2 \tag{41}$$

Let $(U, X)$ be an arbitrarily dependent random variable pair, which is independent of
$(N_1, N^*, N_2, N_Z)$, and let the second moment of $X$ be constrained as $E[X^2] \leq P$. Then, for
any feasible $(U, X)$, we can find a $P^* \leq P$ such that

$$h(X + N_Z \mid U) - h(X + N^* \mid U) = \frac{1}{2} \log \frac{P^* + \sigma_Z^2}{P^* + \sigma_*^2} \tag{42}$$

and

$$h(X + N_Z \mid U) - h(X + N_1 \mid U) \geq \frac{1}{2} \log \frac{P^* + \sigma_Z^2}{P^* + \sigma_1^2} \tag{43}$$
$$h(X + N_Z \mid U) - h(X + N_2 \mid U) \leq \frac{1}{2} \log \frac{P^* + \sigma_Z^2}{P^* + \sigma_2^2} \tag{44}$$

for any $(\sigma_1^2, \sigma_2^2)$ satisfying the order in (41).

Costa's entropy power inequality [15] plays a key role in the proof of this theorem. The proof
of this theorem is provided in Appendix C.

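We are now ready to establish the secrecy capacity region of the Gaussian parallel
degraded compound multi-receiver wiretap channel.

Theorem 4 The secrecy capacity region of the Gaussian parallel degraded compound
multi-receiver wiretap channel is given by the union of rate pairs $(R_1, R_2)$ satisfying

$$R_1 \leq \min_{j=1,\ldots,K_1} \sum_{\ell=1}^{L} \frac{1}{2} \log\left(1 + \frac{\beta_\ell P_\ell}{\Lambda_{j,\ell\ell}^1}\right) - \frac{1}{2} \log\left(1 + \frac{\beta_\ell P_\ell}{\Lambda_{\ell\ell}^Z}\right) \tag{45}$$
$$R_2 \leq \min_{k=1,\ldots,K_2} \sum_{\ell=1}^{L} \frac{1}{2} \log\left(1 + \frac{\bar{\beta}_\ell P_\ell}{\beta_\ell P_\ell + \Lambda_{k,\ell\ell}^2}\right) - \frac{1}{2} \log\left(1 + \frac{\bar{\beta}_\ell P_\ell}{\beta_\ell P_\ell + \Lambda_{\ell\ell}^Z}\right) \tag{46}$$

where the union is over all $\{P_\ell\}_{\ell=1}^{L}$ such that $\sum_{\ell=1}^{L} P_\ell = P$ and $\bar{\beta}_\ell = 1 - \beta_\ell \in [0, 1]$,
$\ell = 1, \ldots, L$.

The proof of this theorem is provided in Appendix D. Here, $P_\ell$ denotes the part of the total
available power $P$ which is devoted to the transmission in the $\ell$th sub-channel. Furthermore,
$\beta_\ell$ denotes the fraction of the power $P_\ell$ of the $\ell$th sub-channel spent for the transmission to
users in the first group.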
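The role of the power split can be illustrated with a short calculation. The sketch below evaluates the two bounds of Theorem 4 for one fixed allocation, assuming the usual Gaussian superposition form: on each sub-channel the first group sees signal power $\beta_\ell P_\ell$ over its noise, and the second group sees signal power $(1-\beta_\ell) P_\ell$ over its noise plus the interference $\beta_\ell P_\ell$, with the corresponding eavesdropper terms subtracted. The function name, the argument layout, and the use of base-2 logarithms (rates in bits) are assumptions of this sketch.

```python
import math


def gaussian_parallel_rates(powers, betas, lam1, lam2, lamz):
    """Evaluate the two rate bounds for one power allocation.

    powers[ell]: power on sub-channel ell; betas[ell]: fraction of that
    power used for the first group; lam1[j][ell], lam2[k][ell]: noise
    variances of the users; lamz[ell]: noise variance of the eavesdropper.
    """
    def cap(snr):
        # Gaussian capacity function in bits.
        return 0.5 * math.log2(1.0 + snr)

    subs = range(len(powers))
    # First group: own signal beta*P against noise, minus eavesdropper leakage.
    r1 = min(
        sum(cap(betas[ell] * powers[ell] / row[ell])
            - cap(betas[ell] * powers[ell] / lamz[ell]) for ell in subs)
        for row in lam1)
    # Second group: (1-beta)*P against noise plus the first group's signal.
    r2 = min(
        sum(cap((1.0 - betas[ell]) * powers[ell] / (betas[ell] * powers[ell] + row[ell]))
            - cap((1.0 - betas[ell]) * powers[ell] / (betas[ell] * powers[ell] + lamz[ell]))
            for ell in subs)
        for row in lam2)
    return r1, r2


# Hypothetical two-sub-channel example with two users per group.
print(gaussian_parallel_rates(
    powers=[2.0, 1.0], betas=[0.5, 0.25],
    lam1=[[1.0, 1.0], [1.5, 0.5]], lam2=[[2.0, 2.0], [2.5, 3.0]],
    lamz=[4.0, 5.0]))
```

Tracing the boundary of the region would additionally require a search over all allocations with total power $P$ and all $\beta_\ell \in [0, 1]$, which this sketch does not attempt.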

3.1.3 Gaussian MIMO Degraded Compound Multi-receiver Wiretap Channels 

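In this section, we first obtain the secrecy capacity region of the Gaussian MIMO degraded
compound multi-receiver wiretap channel when $K_2 = 1$, and then partially characterize the
secrecy capacity region for the case $K_2 > 1$. To that end, we need to evaluate the region
given in Theorem 1. In other words, we need to find the optimal random variable pair $(U, \mathbf{X})$.
We are able to do this for the entire capacity region when there is only one user in the second
group, i.e., $K_2 = 1$. For this, we need the following theorem.

Theorem 5 Let $(\mathbf{N}_1, \mathbf{N}^*, \mathbf{N}_Z)$ be zero-mean Gaussian random vectors with covariance
matrices $\boldsymbol{\Sigma}_1, \boldsymbol{\Sigma}^*, \boldsymbol{\Sigma}_Z$, respectively, where

$$\boldsymbol{\Sigma}_1 \preceq \boldsymbol{\Sigma}^* \preceq \boldsymbol{\Sigma}_Z \tag{47}$$

Let $(U, \mathbf{X})$ be an arbitrarily dependent random vector pair, which is independent of $(\mathbf{N}_1, \mathbf{N}^*, \mathbf{N}_Z)$,
and let the second moment of $\mathbf{X}$ be constrained as $E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S}$. Then, for any feasible
$(U, \mathbf{X})$, we can find a positive semi-definite matrix $\mathbf{K}^*$ such that $\mathbf{K}^* \preceq \mathbf{S}$, and which satisfies

$$h(\mathbf{X} + \mathbf{N}_Z \mid U) - h(\mathbf{X} + \mathbf{N}^* \mid U) = \frac{1}{2} \log \frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}^*|} \tag{48}$$

and

$$h(\mathbf{X} + \mathbf{N}_Z \mid U) - h(\mathbf{X} + \mathbf{N}_1 \mid U) \geq \frac{1}{2} \log \frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_1|} \tag{49}$$

for any $\boldsymbol{\Sigma}_1$ satisfying the order in (47).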

The proof of this theorem can be found in [3]. Using this theorem, we can establish the
secrecy capacity region of the Gaussian MIMO degraded compound multi-receiver wiretap
channel when $K_2 = 1$ as follows.

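Theorem 6 The secrecy capacity region of the Gaussian MIMO degraded compound channel
when $K_2 = 1$ is given by the union of rate pairs $(R_1, R_2)$ satisfying

$$R_1 \leq \min_{j=1,\ldots,K_1} \frac{1}{2} \log \frac{|\mathbf{K} + \boldsymbol{\Sigma}_j^1|}{|\boldsymbol{\Sigma}_j^1|} - \frac{1}{2} \log \frac{|\mathbf{K} + \boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{50}$$
$$R_2 \leq \frac{1}{2} \log \frac{|\mathbf{S} + \boldsymbol{\Sigma}^2|}{|\mathbf{K} + \boldsymbol{\Sigma}^2|} - \frac{1}{2} \log \frac{|\mathbf{S} + \boldsymbol{\Sigma}_Z|}{|\mathbf{K} + \boldsymbol{\Sigma}_Z|} \tag{51}$$

where we dropped the subscript of $\boldsymbol{\Sigma}_1^2$ since $K_2 = 1$, and the union is over all positive
semi-definite matrices $\mathbf{K}$ such that $\mathbf{K} \preceq \mathbf{S}$.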

The proof of this theorem is given in Appendix E.
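For a fixed covariance matrix $\mathbf{K}$, the two bounds of Theorem 6 are differences of log-determinants and can be evaluated directly. The sketch below does this in plain Python, assuming all noise covariances are symmetric positive definite so that a Cholesky factorization applies. The function names, the list-of-rows matrix layout, and the use of base-2 logarithms are assumptions of this sketch, and it evaluates a single choice of $\mathbf{K}$ rather than the union over all $\mathbf{0} \preceq \mathbf{K} \preceq \mathbf{S}$.

```python
import math


def logdet(a):
    """Natural log-determinant of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][i] = math.sqrt(s)
                total += math.log(low[i][i])
            else:
                low[i][j] = s / low[j][j]
    # det(a) is the squared product of the Cholesky diagonal.
    return 2.0 * total


def add(a, b):
    """Entrywise sum of two matrices stored as lists of rows."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mimo_rates(k, s, sig1, sig2, sigz):
    """Evaluate the two bounds for a given K (rates in bits).

    k: candidate covariance K; s: covariance constraint S; sig1: list of
    first-group noise covariances; sig2: noise covariance of the single
    second-group user; sigz: noise covariance of the eavesdropper.
    """
    to_bits = 0.5 / math.log(2.0)
    leak = logdet(add(k, sigz)) - logdet(sigz)
    r1 = min(logdet(add(k, sj)) - logdet(sj) for sj in sig1) - leak
    r2 = ((logdet(add(s, sig2)) - logdet(add(k, sig2)))
          - (logdet(add(s, sigz)) - logdet(add(k, sigz))))
    return to_bits * r1, to_bits * r2


# Hypothetical 2x2 example with two users in the first group.
S = [[2.0, 0.5], [0.5, 2.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
SIG1 = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.2], [0.2, 1.0]]]
SIG2 = [[2.0, 0.0], [0.0, 2.0]]
SIGZ = [[4.0, 0.0], [0.0, 4.0]]
print(mimo_rates(K, S, SIG1, SIG2, SIGZ))
```

Setting $\mathbf{K} = \mathbf{0}$ makes the first bound vanish and setting $\mathbf{K} = \mathbf{S}$ makes the second bound vanish, which mirrors the two extreme power splits in the scalar case.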

We now consider the case $K_2 > 1$. We first note that since the secrecy capacity region
given in Theorem 1 is convex, the boundary of this region can be written as the solution of
the following optimization problem

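$$\max_{(U, \mathbf{X})} \; \min_{j=1,\ldots,K_1} R_{1j} + \mu \min_{k=1,\ldots,K_2} R_{2k} \tag{52}$$

where $R_{1j}$ and $R_{2k}$ are given by

$$R_{1j} = I(\mathbf{X}; \mathbf{Y}_j^1 \mid U, \mathbf{Z}) = I(\mathbf{X}; \mathbf{Y}_j^1 \mid U) - I(\mathbf{X}; \mathbf{Z} \mid U) \tag{53}$$
$$R_{2k} = I(U; \mathbf{Y}_k^2 \mid \mathbf{Z}) = I(U; \mathbf{Y}_k^2) - I(U; \mathbf{Z}) \tag{54}$$

respectively, and the maximization is over all $(U, \mathbf{X})$ such that $E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S}$. In the sequel,
we show that jointly Gaussian $(U, \mathbf{X})$ is the maximizer for (52) when $\mu \leq 1$. To this end,
we need to consider the optimal Gaussian solution for (52), i.e., the solution of (52) when
$(U, \mathbf{X})$ is restricted to be Gaussian. The corresponding optimization problem is

$$\max_{\mathbf{0} \preceq \mathbf{K} \preceq \mathbf{S}} \; \min_{j=1,\ldots,K_1} R_{1j}^G(\mathbf{K}) + \mu \min_{k=1,\ldots,K_2} R_{2k}^G(\mathbf{K}) \tag{55}$$

where $R_{1j}^G(\mathbf{K})$ and $R_{2k}^G(\mathbf{K})$ are given by

$$R_{1j}^G(\mathbf{K}) = \frac{1}{2} \log \frac{|\mathbf{K} + \boldsymbol{\Sigma}_j^1|}{|\boldsymbol{\Sigma}_j^1|} - \frac{1}{2} \log \frac{|\mathbf{K} + \boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{56}$$
$$R_{2k}^G(\mathbf{K}) = \frac{1}{2} \log \frac{|\mathbf{S} + \boldsymbol{\Sigma}_k^2|}{|\mathbf{K} + \boldsymbol{\Sigma}_k^2|} - \frac{1}{2} \log \frac{|\mathbf{S} + \boldsymbol{\Sigma}_Z|}{|\mathbf{K} + \boldsymbol{\Sigma}_Z|} \tag{57}$$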

We assume that the maximum for ( 1551) occurs at K = K*, and the corresponding rate pair 
is (Rl,R$, i.e., 

R\ = min R?AK*) (58) 
i£ = min fl^K*) (59) 

fe=l,...,i^2 
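In the scalar case, where every covariance matrix reduces to a nonnegative number, the Gaussian problem in (55) can be explored with a plain grid search over $0 \leq K \leq S$. The sketch below is only an illustration under that scalar assumption; the function names, the grid resolution, the example noise variances, and the base-2 logarithm are hypothetical choices and not part of the original analysis.

```python
import math

def rg1(K, noise1, noise_z):
    """Scalar version of R^G_{1j}(K) in (56)."""
    return 0.5 * math.log2((K + noise1) / noise1) - 0.5 * math.log2((K + noise_z) / noise_z)

def rg2(K, S, noise2, noise_z):
    """Scalar version of R^G_{2k}(K) in (57)."""
    return (0.5 * math.log2((S + noise2) / (K + noise2))
            - 0.5 * math.log2((S + noise_z) / (K + noise_z)))

def weighted_objective(K, S, mu, noises1, noises2, noise_z):
    """Objective of (55): worst first-group rate plus mu times worst second-group rate."""
    worst1 = min(rg1(K, n1, noise_z) for n1 in noises1)
    worst2 = min(rg2(K, S, n2, noise_z) for n2 in noises2)
    return worst1 + mu * worst2

def grid_search(S, mu, noises1, noises2, noise_z, steps=1000):
    """Return (K, value) maximizing the objective on a uniform grid of [0, S]."""
    best_k = 0.0
    best_val = weighted_objective(0.0, S, mu, noises1, noises2, noise_z)
    for i in range(1, steps + 1):
        K = S * i / steps
        val = weighted_objective(K, S, mu, noises1, noises2, noise_z)
        if val > best_val:
            best_k, best_val = K, val
    return best_k, best_val

# Hypothetical variances obeying the order: first group <= second group <= eavesdropper.
k_star, value = grid_search(S=10.0, mu=0.5, noises1=[1.0, 1.2], noises2=[2.0, 2.5], noise_z=4.0)
print(f"K* ~ {k_star:.3f}, objective ~ {value:.4f}")
```

A grid search says nothing about optimality over non-Gaussian inputs; it only locates the maximizer of the Gaussian problem, which is the quantity $\mathbf{K}^*$ assumed above.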

The KKT conditions that this optimal covariance matrix $\mathbf{K}^*$ needs to satisfy are given in
the following lemma.

Lemma 1 The optimal covariance matrix for (55), $\mathbf{K}^*$, needs to satisfy

$$\sum_{j=1}^{K_1} \lambda_{1j} (\mathbf{K}^* + \boldsymbol{\Sigma}_j^1)^{-1} - (\mathbf{K}^* + \boldsymbol{\Sigma}_Z)^{-1} + \mathbf{M} = \mu \sum_{k=1}^{K_2} \lambda_{2k} (\mathbf{K}^* + \boldsymbol{\Sigma}_k^2)^{-1} - \mu (\mathbf{K}^* + \boldsymbol{\Sigma}_Z)^{-1} + \mathbf{M}_S \tag{60}$$

where $\sum_{j=1}^{K_1} \lambda_{1j} = 1$, and $\lambda_{1j} \geq 0$ with equality if $R_{1j}^G(\mathbf{K}^*) > R_1^*$; $\sum_{k=1}^{K_2} \lambda_{2k} = 1$, and $\lambda_{2k} \geq 0$
with equality if $R_{2k}^G(\mathbf{K}^*) > R_2^*$; and $\mathbf{M}$ and $\mathbf{M}_S$ are positive semi-definite matrices which
satisfy $\mathbf{K}^* \mathbf{M} = \mathbf{M} \mathbf{K}^* = \mathbf{0}$ and $(\mathbf{S} - \mathbf{K}^*) \mathbf{M}_S = \mathbf{M}_S (\mathbf{S} - \mathbf{K}^*) = \mathbf{0}$, respectively.

1 With this assumption, we implicitly assume that the maximum in (55) occurs at a single rate pair
$(R_1^*, R_2^*)$. In fact, there might be more than one rate pair where the maximum occurs. Even if this is the
case, we can simply consider only one of them, since our ultimate goal is to show that the maximum in (52)
is equal to the maximum in (55).

The proof of this lemma is given in Appendix F.

To show that both (52) and (55) have the same value when $\mu \leq 1$, we use the following
optimization result due to [14].

Lemma 2 ([14], Lemma 2) Let $U, \mathbf{X}, \{\mathbf{N}_j^1\}_{j=1}^{K_1}, \{\mathbf{N}_k^2\}_{k=1}^{K_2}, \mathbf{N}_Z$ be as defined before. The following expression

$$\sum_{j=1}^{K_1} \lambda_{1j} h(\mathbf{X} + \mathbf{N}_j^1 | U) - \mu \sum_{k=1}^{K_2} \lambda_{2k} h(\mathbf{X} + \mathbf{N}_k^2 | U) - (1 - \mu) h(\mathbf{X} + \mathbf{N}_Z | U) \tag{61}$$

is maximized by jointly Gaussian $(U, \mathbf{X})$ when $\mu \leq 1$. Furthermore, the optimal covariance
matrix needs to satisfy (60), where $\mathbf{M}$ and $\mathbf{M}_S$ are as they are defined in Lemma 1.

In [14], a weaker version of this lemma is proved. This weaker version requires the
existence of a covariance matrix $\mathbf{K}^*$ for which the Lagrange multiplier $\mathbf{M}$ in (60) is zero.
However, using the channel enhancement technique [17], this requirement can be removed.
Using Lemma 2 in conjunction with Lemma 1, we are able to characterize the secrecy capacity
region partially for the case $K_2 > 1$.

Theorem 7 The boundary of the secrecy capacity region of the degraded Gaussian MIMO
compound multi-receiver wiretap channel is given by the solution of the following optimization
problem

$$\max_{\mathbf{0} \preceq \mathbf{K} \preceq \mathbf{S}} \ \min_{j=1,\ldots,K_1} R_{1j}^G(\mathbf{K}) + \mu \min_{k=1,\ldots,K_2} R_{2k}^G(\mathbf{K}) \tag{62}$$

for $\mu \leq 1$. That is, for this part of the secrecy rate region, jointly Gaussian auxiliary random
variables and channel inputs are optimal.

The proof of this theorem is given in Appendix F.

3.2 The Second Scenario: Layered Confidential Messages

In the second scenario, the transmitter wants to send a confidential message to users in
the first group which needs to be kept confidential from the second group of users and
eavesdroppers. The transmitter also wants to send a different confidential message to users in
the second group, which needs to be kept confidential from the eavesdroppers. As opposed to
the first scenario, in this case, we do not put any restriction on the number of eavesdroppers.
A graphical illustration of the second scenario is given in the corresponding figure. The situation where
there is only one user in each group and one eavesdropper was investigated in [16]. Hence,
this second scenario can be seen as a generalization of the model in [16] to a compound
channel setting. Following the terminology of [16], we call this channel model the degraded
compound multi-receiver wiretap channel with layered messages.

An $(n, 2^{nR_1}, 2^{nR_2})$ code for the degraded compound multi-receiver wiretap channel with
layered messages consists of two message sets $\mathcal{W}_1 = \{1, \ldots, 2^{nR_1}\}$, $\mathcal{W}_2 = \{1, \ldots, 2^{nR_2}\}$ and
an encoder $f: \mathcal{W}_1 \times \mathcal{W}_2 \to \mathcal{X}^n$, one decoder for each legitimate user in the first group
$g_j^1: \mathcal{Y}_j^{1,n} \to \mathcal{W}_1$, $j = 1, \ldots, K_1$, and one decoder for each legitimate user in the second group
$g_k^2: \mathcal{Y}_k^{2,n} \to \mathcal{W}_2$, $k = 1, \ldots, K_2$. The probability of error is defined as

$$P_e^n = \max\{P_e^{1,n}, P_e^{2,n}\} \tag{63}$$

where $P_e^{1,n}$ and $P_e^{2,n}$ are given by

$$P_e^{1,n} = \max_{j \in \{1,\ldots,K_1\}} \Pr\left[g_j^1(Y_j^{1,n}) \neq W_1\right] \tag{64}$$

$$P_e^{2,n} = \max_{k \in \{1,\ldots,K_2\}} \Pr\left[g_k^2(Y_k^{2,n}) \neq W_2\right] \tag{65}$$

A secrecy rate pair is said to be achievable if there exists an $(n, 2^{nR_1}, 2^{nR_2})$ code which has
$\lim_{n \to \infty} P_e^n = 0$,

$$\lim_{n \to \infty} \frac{1}{n} I(W_2; Z_t^n) = 0, \quad t = 1, \ldots, K_Z \tag{66}$$

and

$$\lim_{n \to \infty} \frac{1}{n} I(W_1; Y_k^{2,n} | W_2) = 0, \quad k = 1, \ldots, K_2 \tag{67}$$

We note that these two secrecy conditions imply

$$\lim_{n \to \infty} \frac{1}{n} I(W_1, W_2; Z_t^n) = 0, \quad t = 1, \ldots, K_Z \tag{68}$$

Furthermore, it is clear that we are only interested in perfect secrecy rates of the channel.
The secrecy capacity region is defined as the closure of all achievable secrecy rate pairs. A
single-letter characterization of the secrecy capacity region is given as follows.

Theorem 8 The secrecy capacity region of the degraded compound multi-receiver wiretap
channel with layered messages is given by the union of rate pairs $(R_1, R_2)$ satisfying

$$R_1 \leq \min_{\substack{j=1,\ldots,K_1 \\ k=1,\ldots,K_2}} I(X; Y_j^1 | U, Y_k^2) \tag{69}$$

$$R_2 \leq \min_{\substack{k=1,\ldots,K_2 \\ t=1,\ldots,K_Z}} I(U; Y_k^2 | Z_t) \tag{70}$$

where the union is over all random variable pairs $(U, X)$ such that

$$U \to X \to Y_j^1 \to Y^* \to Y_k^2 \to Z^* \to Z_t \tag{71}$$

for any triple $(j, k, t)$.

The proof of this theorem is given in Appendix G. Similar to the converse proof of Theorem 1,
the presence of the fictitious users $Y^*$ and $Z^*$ plays an important role here as well. In
particular, these two random variables introduce a conditional independence structure to
the channel which enables us to define the auxiliary random variable $U$ that yields a tight
outer bound. Despite this similarity in the role of fictitious users in converse proofs, there
is a significant difference between Theorems 1 and 8: it does not seem to be
possible to extend Theorem 1 to an arbitrary number of eavesdroppers, while Theorem 8
holds for any number of eavesdroppers. This is due to the difference between the two communication
scenarios. In the second scenario, since we assume that users in the second group as well as
the eavesdroppers wiretap users in the first group, we are able to provide a converse proof
for the general situation of an arbitrary number of eavesdroppers.

As an aside, if we set $K_1 = K_2 = K_Z = 1$, then the degraded compound multi-receiver
wiretap channel with layered messages reduces to the degraded multi-receiver wiretap channel
with layered messages of [16], and the secrecy capacity region in Theorem 8 reduces to the secrecy
capacity region of the channel model in [16].

3.2.1 Parallel Degraded Compound Multi-receiver Wiretap Channels with Layered Messages

In the next section, we investigate the Gaussian parallel degraded compound multi-receiver
wiretap channel with layered messages. To that end, here we obtain the secrecy capacity region
of the parallel degraded compound multi-receiver wiretap channel with layered messages
in a single-letter form as follows.

Theorem 9 The secrecy capacity region of the parallel degraded compound multi-receiver
wiretap channel with layered messages is given by the union of rate pairs $(R_1, R_2)$ satisfying

$$R_1 \leq \min_{\substack{j=1,\ldots,K_1 \\ k=1,\ldots,K_2}} \sum_{\ell=1}^{L} I(X_\ell; Y_{j\ell}^1 | U_\ell, Y_{k\ell}^2) \tag{72}$$

$$R_2 \leq \min_{\substack{k=1,\ldots,K_2 \\ t=1,\ldots,K_Z}} \sum_{\ell=1}^{L} I(U_\ell; Y_{k\ell}^2 | Z_{t\ell}) \tag{73}$$

where the union is over all $\prod_{\ell=1}^{L} p(u_\ell, x_\ell)$ such that

$$U_\ell \to X_\ell \to Y_{j\ell}^1 \to Y_\ell^* \to Y_{k\ell}^2 \to Z_\ell^* \to Z_{t\ell} \tag{74}$$

for any $(\ell, j, k, t)$.

Since the parallel degraded compound multi-receiver wiretap channel with layered messages
is a special case of the degraded compound multi-receiver wiretap channel, Theorem 8 implicitly
gives the secrecy capacity region of parallel degraded compound multi-receiver wiretap
channels with layered messages. However, we still need to show that the region in Theorem 8
is equivalent to the region in Theorem 9. That is, we need to prove the optimality of independent
signalling in each sub-channel. The proof of Theorem 9 is provided in Appendix H.

3.2.2 Gaussian Parallel Degraded Compound Multi-receiver Wiretap Channels
with Layered Messages

We now obtain the secrecy capacity region of Gaussian parallel degraded compound multi-receiver
wiretap channels with layered messages. To that end, we need to evaluate the region
given in Theorem 9, i.e., we need to find the optimal distribution $\prod_{\ell=1}^{L} p(u_\ell, x_\ell)$. We first
introduce the following theorem, which is an extension of Theorem 3.

Theorem 10 Let $N_1, N^*, N_2, \tilde{N}, N_Z$ be zero-mean Gaussian random variables with variances
$\sigma_1^2, \sigma_*^2, \sigma_2^2, \tilde{\sigma}^2, \sigma_Z^2$, respectively, where

$$\sigma_1^2 \leq \sigma_*^2 \leq \sigma_2^2 \leq \tilde{\sigma}^2 \leq \sigma_Z^2 \tag{75}$$

Let $(U, X)$ be an arbitrarily dependent random variable pair, which is independent of
$(N_1, N^*, N_2, \tilde{N}, N_Z)$, and let the second moment of $X$ be constrained as $E[X^2] \leq P$. Then,
for any feasible $(U, X)$, we can find a $P^* \leq P$ such that

$$h(X + \tilde{N} | U) - h(X + N^* | U) = \frac{1}{2} \log \frac{P^* + \tilde{\sigma}^2}{P^* + \sigma_*^2} \tag{76}$$

and

$$h(X + N_Z | U) - h(X + N_2 | U) \leq \frac{1}{2} \log \frac{P^* + \sigma_Z^2}{P^* + \sigma_2^2} \tag{77}$$

$$h(X + N_2 | U) - h(X + N_1 | U) \geq \frac{1}{2} \log \frac{P^* + \sigma_2^2}{P^* + \sigma_1^2} \tag{78}$$

for any $(\sigma_1^2, \sigma_2^2, \sigma_Z^2)$ satisfying the order in (75).

The proof of this theorem is given in Appendix I. The proof relies mainly
on Theorem 3 and Costa's entropy power inequality [15].

Using this theorem, we can establish the secrecy capacity region of the Gaussian parallel
degraded compound multi-receiver wiretap channel with layered messages as follows.

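Theorem 11 The secrecy capacity region of the Gaussian parallel degraded compound multi-receiver
wiretap channel with layered messages is given by the union of rate pairs $(R_1, R_2)$
satisfying

$$R_1 \leq \min_{\substack{j=1,\ldots,K_1 \\ k=1,\ldots,K_2}} \sum_{\ell=1}^{L} \frac{1}{2} \log\left(1 + \frac{\beta_\ell P_\ell}{\Lambda_{j,\ell\ell}^1}\right) - \frac{1}{2} \log\left(1 + \frac{\beta_\ell P_\ell}{\Lambda_{k,\ell\ell}^2}\right) \tag{79}$$

$$R_2 \leq \min_{\substack{k=1,\ldots,K_2 \\ t=1,\ldots,K_Z}} \sum_{\ell=1}^{L} \frac{1}{2} \log\left(1 + \frac{\bar{\beta}_\ell P_\ell}{\beta_\ell P_\ell + \Lambda_{k,\ell\ell}^2}\right) - \frac{1}{2} \log\left(1 + \frac{\bar{\beta}_\ell P_\ell}{\beta_\ell P_\ell + \Lambda_{t,\ell\ell}^Z}\right) \tag{80}$$

where $\bar{\beta}_\ell = 1 - \beta_\ell \in [0, 1]$, $\ell = 1, \ldots, L$, and the union is over all $\{P_\ell\}_{\ell=1}^{L}$ such that $\sum_{\ell=1}^{L} P_\ell = P$.

The proof of this theorem is given in Appendix J. Similar to Theorem 4, here also, $P_\ell$ denotes
the amount of power $P$ devoted to the transmission in the $\ell$th sub-channel. Similarly, $\beta_\ell$ is
the fraction of the power $P_\ell$ of the $\ell$th sub-channel spent for the transmission to users in the
first group.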
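To make the roles of $P_\ell$ and $\beta_\ell$ concrete, the following sketch evaluates rate bounds of the usual superposition form, namely $\frac{1}{2}\log(1 + \beta_\ell P_\ell / \Lambda^1) - \frac{1}{2}\log(1 + \beta_\ell P_\ell / \Lambda^2)$ per sub-channel for the first message and $\frac{1}{2}\log(1 + \bar{\beta}_\ell P_\ell / (\beta_\ell P_\ell + \Lambda^2)) - \frac{1}{2}\log(1 + \bar{\beta}_\ell P_\ell / (\beta_\ell P_\ell + \Lambda^Z))$ for the second. That this is the exact form of the bounds in Theorem 11 is an assumption made here, as are the function names, the list-based data layout, the example values, and the base-2 logarithm.

```python
import math

def c(x):
    """(1/2) log2(1 + x)."""
    return 0.5 * math.log2(1.0 + x)

def layered_parallel_rates(P, beta, noise1, noise2, noise_z):
    """Evaluate the assumed per-sub-channel bounds, minimized over users.

    P[l]          : power of sub-channel l
    beta[l]       : fraction of P[l] spent on the first group's message
    noise1[j][l]  : noise variance of first-group user j on sub-channel l
    noise2[k][l]  : noise variance of second-group user k on sub-channel l
    noise_z[t][l] : noise variance of eavesdropper t on sub-channel l
    """
    L = len(P)
    r1 = min(
        sum(c(beta[l] * P[l] / n1[l]) - c(beta[l] * P[l] / n2[l]) for l in range(L))
        for n1 in noise1 for n2 in noise2
    )
    r2 = min(
        sum(
            c((1 - beta[l]) * P[l] / (beta[l] * P[l] + n2[l]))
            - c((1 - beta[l]) * P[l] / (beta[l] * P[l] + nz[l]))
            for l in range(L)
        )
        for n2 in noise2 for nz in noise_z
    )
    return r1, r2

# Hypothetical two-sub-channel example: two first-group users, one second-group user, two eavesdroppers.
P = [2.0, 1.0]
beta = [0.5, 0.25]
noise1 = [[1.0, 1.0], [1.0, 1.5]]
noise2 = [[2.0, 2.0]]
noise_z = [[3.0, 4.0], [5.0, 3.0]]
r1, r2 = layered_parallel_rates(P, beta, noise1, noise2, noise_z)
print(f"R1 <= {r1:.4f} bits, R2 <= {r2:.4f} bits")
```

Setting every $\beta_\ell$ to 1 devotes all power to the first group and makes the second bound vanish, while setting every $\beta_\ell$ to 0 does the opposite.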



3.2.3 Gaussian MIMO Degraded Compound Multi-receiver Wiretap Channels 
with Layered Messages 

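We now obtain the secrecy capacity region of the Gaussian MIMO degraded compound
multi-receiver wiretap channel with layered messages. To that end, we need to evaluate the
region given in Theorem 8, i.e., find the optimal random vector pair $(U, \mathbf{X})$. We are able to
find the optimal random vector pair $(U, \mathbf{X})$ when there is only one user in the second group,
i.e., $K_2 = 1$. To obtain that result, we first need the following generalization of Theorem 5.

Theorem 12 Let $(\mathbf{N}_1, \mathbf{N}_2, \mathbf{N}^*, \mathbf{N}_Z)$ be Gaussian random vectors with covariance matrices
$\boldsymbol{\Sigma}_1, \boldsymbol{\Sigma}_2, \boldsymbol{\Sigma}^*, \boldsymbol{\Sigma}_Z$, respectively, where

$$\boldsymbol{\Sigma}_1 \preceq \boldsymbol{\Sigma}_2 \preceq \boldsymbol{\Sigma}^* \preceq \boldsymbol{\Sigma}_Z \tag{81}$$

Let $(U, \mathbf{X})$ be an arbitrarily dependent random vector pair, which is independent of
$(\mathbf{N}_1, \mathbf{N}_2, \mathbf{N}^*, \mathbf{N}_Z)$, and let the second moment of $\mathbf{X}$ be constrained as $E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S}$. Then,
for any feasible $(U, \mathbf{X})$, there exists a positive semi-definite matrix $\mathbf{K}^*$ such that $\mathbf{K}^* \preceq \mathbf{S}$,
and it satisfies

$$h(\mathbf{X} + \mathbf{N}^* | U) - h(\mathbf{X} + \mathbf{N}_2 | U) = \frac{1}{2} \log \frac{|\mathbf{K}^* + \boldsymbol{\Sigma}^*|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} \tag{82}$$

and

$$h(\mathbf{X} + \mathbf{N}_Z | U) - h(\mathbf{X} + \mathbf{N}_2 | U) \leq \frac{1}{2} \log \frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} \tag{83}$$

$$h(\mathbf{X} + \mathbf{N}_2 | U) - h(\mathbf{X} + \mathbf{N}_1 | U) \geq \frac{1}{2} \log \frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_1|} \tag{84}$$

for any $(\boldsymbol{\Sigma}_1, \boldsymbol{\Sigma}_Z)$ satisfying the order in (81).

The proof of this theorem is given in Appendix L. Using this theorem, we can find the
secrecy capacity region of the Gaussian MIMO degraded compound multi-receiver wiretap
channel with layered messages when $K_2 = 1$ as follows.

Theorem 13 The secrecy capacity region of the Gaussian MIMO degraded compound multi-receiver
wiretap channel with layered messages when $K_2 = 1$ is given by the union of rate
pairs $(R_1, R_2)$ satisfying

$$R_1 \leq \min_{j=1,\ldots,K_1} \frac{1}{2} \log \frac{|\mathbf{K} + \boldsymbol{\Sigma}_j^1|}{|\boldsymbol{\Sigma}_j^1|} - \frac{1}{2} \log \frac{|\mathbf{K} + \boldsymbol{\Sigma}^2|}{|\boldsymbol{\Sigma}^2|} \tag{85}$$

$$R_2 \leq \min_{t=1,\ldots,K_Z} \frac{1}{2} \log \frac{|\mathbf{S} + \boldsymbol{\Sigma}^2|}{|\mathbf{K} + \boldsymbol{\Sigma}^2|} - \frac{1}{2} \log \frac{|\mathbf{S} + \boldsymbol{\Sigma}_t^Z|}{|\mathbf{K} + \boldsymbol{\Sigma}_t^Z|} \tag{86}$$

where the union is over all positive semi-definite matrices $\mathbf{K}$ such that $\mathbf{K} \preceq \mathbf{S}$.

The proof of this theorem is given in Appendix M. As an aside, if we set $K_1 = K_Z = 1$
in this theorem, we can recover the secrecy capacity region of the degraded multi-receiver
wiretap channel with layered messages that was established in [16].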



4 Conclusions 

In this paper, we studied two different communication scenarios for the degraded compound
multi-receiver wiretap channel. In the first scenario, the transmitter wants to send a confidential
message to users in the first group, and a different confidential message to users in
the second group, where both messages are to be kept confidential from an eavesdropper. We
established the secrecy capacity region of the general discrete memoryless channel model, the
parallel channel model, and the Gaussian parallel channel model. For the Gaussian MIMO
channel model, we obtained the secrecy capacity region when there is only one user in the
second group. We also provided a partial characterization of the secrecy capacity region when
there is an arbitrary number of users in the second group.

In the second scenario, the transmitter sends a confidential message to users in
the first group which is wiretapped by both users in the second group and eavesdroppers.
In addition to this message sent to the first group of users, the transmitter sends a different
message to users in the second group which needs to be kept confidential only from the
eavesdroppers. In this case, we did not put any restriction on the number of eavesdroppers.
As in the first scenario, we established the secrecy capacity region for the general discrete
memoryless channel model, the parallel channel model, and the Gaussian parallel channel
model. For the Gaussian MIMO channel model, we obtained the secrecy capacity region when
there is only one user in the second group.

Appendices



A Proof of Theorem 1

Achievability is clear. We provide the converse proof. For an arbitrary code achieving the
secrecy rates $(R_1, R_2)$, there exist $(\epsilon_{1,n}, \epsilon_{2,n})$ and $\gamma_n$ which vanish as $n \to \infty$ such that

$$H(W_1 | Y_j^{1,n}) \leq n \epsilon_{1,n}, \quad j = 1, \ldots, K_1 \tag{87}$$
$$H(W_2 | Y_k^{2,n}) \leq n \epsilon_{2,n}, \quad k = 1, \ldots, K_2 \tag{88}$$
$$I(W_1, W_2; Z^n) \leq n \gamma_n \tag{89}$$

where (87) and (88) are due to Fano's lemma, and (89) is due to the perfect secrecy requirement
stated in (33).

We define the following auxiliary random variables

$$U_i = W_2 Y^{*,i-1} Z_{i+1}^n, \quad i = 1, \ldots, n \tag{90}$$

which satisfy the following Markov chain

$$U_i \to X_i \to Y_{j,i}^1 \to Y_i^* \to Y_{k,i}^2 \to Z_i, \quad i = 1, \ldots, n \tag{91}$$

for any $(j, k)$ pair. The Markov chain in (91) is a consequence of the fact that the channel
is memoryless and degraded.

We first bound the rate of the second message:

$$nR_2 = H(W_2) \tag{92}$$
$$\leq I(W_2; Y_k^{2,n}) + n \epsilon_{2,n} \tag{93}$$
$$\leq I(W_2; Y_k^{2,n}) - I(W_2; Z^n) + n(\epsilon_{2,n} + \gamma_n) \tag{94}$$
$$= I(W_2; Y_k^{2,n} | Z^n) + n(\epsilon_{2,n} + \gamma_n) \tag{95}$$
$$= \sum_{i=1}^{n} I(W_2; Y_{k,i}^2 | Y_k^{2,i-1}, Z^n) + n(\epsilon_{2,n} + \gamma_n) \tag{96}$$
$$= \sum_{i=1}^{n} I(W_2; Y_{k,i}^2 | Y_k^{2,i-1}, Z_{i+1}^n, Z_i) + n(\epsilon_{2,n} + \gamma_n) \tag{97}$$
$$\leq \sum_{i=1}^{n} I(Y_k^{2,i-1}, Z_{i+1}^n, W_2; Y_{k,i}^2 | Z_i) + n(\epsilon_{2,n} + \gamma_n) \tag{98}$$
$$\leq \sum_{i=1}^{n} I(Y^{*,i-1}, Y_k^{2,i-1}, Z_{i+1}^n, W_2; Y_{k,i}^2 | Z_i) + n(\epsilon_{2,n} + \gamma_n) \tag{99}$$
$$= \sum_{i=1}^{n} I(Y^{*,i-1}, Z_{i+1}^n, W_2; Y_{k,i}^2 | Z_i) + n(\epsilon_{2,n} + \gamma_n) \tag{100}$$
$$= \sum_{i=1}^{n} I(U_i; Y_{k,i}^2 | Z_i) + n(\epsilon_{2,n} + \gamma_n) \tag{101}$$

where (93) is due to (88), (94) is a consequence of (89), (95) comes from the Markov chain

$$W_2 \to Y_k^{2,n} \to Z^n, \quad k = 1, \ldots, K_2 \tag{102}$$

which is a consequence of the fact that the channel is degraded, (97) comes from the Markov
chain

$$Z^{i-1} \to Y_k^{2,i-1} \to (Y_{k,i}^2, Z_i^n, W_2), \quad k = 1, \ldots, K_2 \tag{103}$$

which is due to the fact that the channel is degraded and memoryless, and (100) is a consequence
of the Markov chain

$$Y_k^{2,i-1} \to Y^{*,i-1} \to (W_2, Z_i^n, Y_{k,i}^2), \quad k = 1, \ldots, K_2 \tag{104}$$

which is due to the Markov chain in (2) and the fact that the channel is memoryless.

Next we bound the rate of the first message:

$$nR_1 = H(W_1) \tag{105}$$
$$= H(W_1 | W_2) \tag{106}$$
$$\leq I(W_1; Y_j^{1,n} | W_2) + n \epsilon_{1,n} \tag{107}$$
$$\leq I(W_1; Y_j^{1,n} | W_2) - I(W_1; Z^n | W_2) + n(\epsilon_{1,n} + \gamma_n) \tag{108}$$
$$= I(W_1; Y_j^{1,n} | W_2, Z^n) + n(\epsilon_{1,n} + \gamma_n) \tag{109}$$
$$= \sum_{i=1}^{n} I(W_1; Y_{j,i}^1 | W_2, Z^n, Y_j^{1,i-1}) + n(\epsilon_{1,n} + \gamma_n) \tag{110}$$
$$= \sum_{i=1}^{n} I(W_1; Y_{j,i}^1 | W_2, Z_{i+1}^n, Y_j^{1,i-1}, Z_i) + n(\epsilon_{1,n} + \gamma_n) \tag{111}$$
$$= \sum_{i=1}^{n} I(W_1; Y_{j,i}^1 | W_2, Z_{i+1}^n, Y_j^{1,i-1}, Y^{*,i-1}, Z_i) + n(\epsilon_{1,n} + \gamma_n) \tag{112}$$
$$\leq \sum_{i=1}^{n} I(X_i, W_1; Y_{j,i}^1 | W_2, Z_{i+1}^n, Y_j^{1,i-1}, Y^{*,i-1}, Z_i) + n(\epsilon_{1,n} + \gamma_n) \tag{113}$$
$$= \sum_{i=1}^{n} I(X_i; Y_{j,i}^1 | W_2, Z_{i+1}^n, Y_j^{1,i-1}, Y^{*,i-1}, Z_i) + n(\epsilon_{1,n} + \gamma_n) \tag{114}$$
$$= \sum_{i=1}^{n} H(Y_{j,i}^1 | W_2, Z_{i+1}^n, Y_j^{1,i-1}, Y^{*,i-1}, Z_i) - H(Y_{j,i}^1 | W_2, Z_{i+1}^n, Y_j^{1,i-1}, Y^{*,i-1}, Z_i, X_i) + n(\epsilon_{1,n} + \gamma_n) \tag{115}$$
$$\leq \sum_{i=1}^{n} H(Y_{j,i}^1 | W_2, Z_{i+1}^n, Y^{*,i-1}, Z_i) - H(Y_{j,i}^1 | W_2, Z_{i+1}^n, Y_j^{1,i-1}, Y^{*,i-1}, Z_i, X_i) + n(\epsilon_{1,n} + \gamma_n) \tag{116}$$
$$= \sum_{i=1}^{n} H(Y_{j,i}^1 | W_2, Z_{i+1}^n, Y^{*,i-1}, Z_i) - H(Y_{j,i}^1 | W_2, Z_{i+1}^n, Y^{*,i-1}, Z_i, X_i) + n(\epsilon_{1,n} + \gamma_n) \tag{117}$$
$$= \sum_{i=1}^{n} I(X_i; Y_{j,i}^1 | W_2, Z_{i+1}^n, Y^{*,i-1}, Z_i) + n(\epsilon_{1,n} + \gamma_n) \tag{118}$$
$$= \sum_{i=1}^{n} I(X_i; Y_{j,i}^1 | U_i, Z_i) + n(\epsilon_{1,n} + \gamma_n) \tag{119}$$

where (107) is due to (87), (108) is a consequence of (89), (109) comes from the Markov
chain

$$(W_1, W_2) \to Y_j^{1,n} \to Z^n, \quad j = 1, \ldots, K_1 \tag{120}$$

which is due to the fact that the channel is degraded, (111) comes from the Markov chain

$$Z^{i-1} \to Y_j^{1,i-1} \to (W_1, W_2, Y_{j,i}^1, Z_i^n), \quad j = 1, \ldots, K_1 \tag{121}$$

which is a consequence of the fact that the channel is degraded and memoryless, (112) follows
from the Markov chain

$$Y^{*,i-1} \to Y_j^{1,i-1} \to (W_1, W_2, Y_{j,i}^1, Z_i^n), \quad j = 1, \ldots, K_1 \tag{122}$$

which results from the Markov chain in (2) and the fact that the channel is memoryless,
(114) is a consequence of the Markov chain

$$(Y_{j,i}^1, Z_i) \to X_i \to (Y_j^{1,i-1}, Y^{*,i-1}, Z_{i+1}^n, W_1, W_2), \quad j = 1, \ldots, K_1 \tag{123}$$

which is due to the fact that the channel is memoryless, (116) comes from the fact that
conditioning cannot increase entropy, and (117) is again due to the Markov chain in (123).

Next, we define a uniformly distributed random variable $Q \in \{1, \ldots, n\}$, and $U = (Q, U_Q)$, $X = X_Q$, $Y_j^1 = Y_{j,Q}^1$, $Y_k^2 = Y_{k,Q}^2$, and $Z = Z_Q$. Using these definitions in (101)
and (119), we obtain the single-letter expressions in Theorem 1.



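B Proof of Theorem 2

The achievability of this region follows from Theorem 1 by selecting $(U, X) = (U_1, X_1, \ldots, U_L, X_L)$ with a joint distribution of the product form $p(u, x) = \prod_{\ell=1}^{L} p(u_\ell, x_\ell)$. We next provide
the converse proof. To that end, we define the following auxiliary random variables

$$U_{\ell,i} = W_2 Y^{*,i-1} Z_{i+1}^n Y_{[1:\ell-1],i}^* Z_{[\ell+1:L],i}, \quad i = 1, \ldots, n, \quad \ell = 1, \ldots, L \tag{124}$$

which satisfy the Markov chain

$$U_{\ell,i} \to X_{\ell,i} \to (Y_{j\ell,i}^1, Y_{k\ell,i}^2, Z_{\ell,i}) \tag{125}$$

for any $(j, k, \ell)$ triple because of the facts that the channel is memoryless and sub-channels
are independent.

We bound the rate of the second message. Following the same steps as in the converse
proof of Theorem 1, we get to (97). Then,

$$nR_2 \leq \sum_{i=1}^{n} I(W_2; Y_{k,i}^2 | Y_k^{2,i-1}, Z_{i+1}^n, Z_i) + n(\epsilon_{2,n} + \gamma_n) \tag{126}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(W_2; Y_{k\ell,i}^2 | Y_k^{2,i-1}, Z_{i+1}^n, Z_i, Y_{k[1:\ell-1],i}^2) + n(\epsilon_{2,n} + \gamma_n) \tag{127}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(W_2; Y_{k\ell,i}^2 | Y_k^{2,i-1}, Z_{i+1}^n, Z_{[\ell+1:L],i}, Y_{k[1:\ell-1],i}^2, Z_{\ell,i}) + n(\epsilon_{2,n} + \gamma_n) \tag{128}$$
$$\leq \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(Y_k^{2,i-1}, Z_{i+1}^n, Z_{[\ell+1:L],i}, Y_{k[1:\ell-1],i}^2, W_2; Y_{k\ell,i}^2 | Z_{\ell,i}) + n(\epsilon_{2,n} + \gamma_n) \tag{129}$$
$$\leq \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(Y^{*,i-1}, Y_k^{2,i-1}, Z_{i+1}^n, Z_{[\ell+1:L],i}, Y_{[1:\ell-1],i}^*, Y_{k[1:\ell-1],i}^2, W_2; Y_{k\ell,i}^2 | Z_{\ell,i}) + n(\epsilon_{2,n} + \gamma_n) \tag{130}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(Y^{*,i-1}, Z_{i+1}^n, Z_{[\ell+1:L],i}, Y_{[1:\ell-1],i}^*, W_2; Y_{k\ell,i}^2 | Z_{\ell,i}) + n(\epsilon_{2,n} + \gamma_n) \tag{131}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(U_{\ell,i}; Y_{k\ell,i}^2 | Z_{\ell,i}) + n(\epsilon_{2,n} + \gamma_n) \tag{132}$$

where (128) follows from the Markov chain

$$Z_{[1:\ell-1],i} \to Y_{k[1:\ell-1],i}^2 \to (W_2, Y_k^{2,i-1}, Z_{i+1}^n, Z_{[\ell:L],i}, Y_{k\ell,i}^2) \tag{133}$$

which is a consequence of the facts that the channel is degraded and memoryless, and sub-channels
are independent, and (131) is due to the Markov chain

$$(Y_k^{2,i-1}, Y_{k[1:\ell-1],i}^2) \to (Y^{*,i-1}, Y_{[1:\ell-1],i}^*) \to (W_2, Z_{i+1}^n, Z_{[\ell:L],i}, Y_{k\ell,i}^2) \tag{134}$$

which is a consequence of the Markov chain in (10) and the facts that the channel is memoryless
and sub-channels are independent.

We next bound the rate of the first message. Again, following the same steps as in the
converse proof of Theorem 1, we get to (111). Then,

$$nR_1 \leq \sum_{i=1}^{n} I(W_1; Y_{j,i}^1 | W_2, Y_j^{1,i-1}, Z_{i+1}^n, Z_i) + n(\epsilon_{1,n} + \gamma_n) \tag{135}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(W_1; Y_{j\ell,i}^1 | W_2, Y_j^{1,i-1}, Z_{i+1}^n, Y_{j[1:\ell-1],i}^1, Z_i) + n(\epsilon_{1,n} + \gamma_n) \tag{136}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(W_1; Y_{j\ell,i}^1 | W_2, Y_j^{1,i-1}, Z_{i+1}^n, Y_{j[1:\ell-1],i}^1, Z_{[\ell+1:L],i}, Z_{\ell,i}) + n(\epsilon_{1,n} + \gamma_n) \tag{137}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(W_1; Y_{j\ell,i}^1 | W_2, Y_j^{1,i-1}, Y^{*,i-1}, Z_{i+1}^n, Y_{j[1:\ell-1],i}^1, Y_{[1:\ell-1],i}^*, Z_{[\ell+1:L],i}, Z_{\ell,i}) + n(\epsilon_{1,n} + \gamma_n) \tag{138}$$
$$\leq \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(X_{\ell,i}, W_1; Y_{j\ell,i}^1 | W_2, Y_j^{1,i-1}, Y^{*,i-1}, Z_{i+1}^n, Y_{j[1:\ell-1],i}^1, Y_{[1:\ell-1],i}^*, Z_{[\ell+1:L],i}, Z_{\ell,i}) + n(\epsilon_{1,n} + \gamma_n) \tag{139}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(X_{\ell,i}; Y_{j\ell,i}^1 | W_2, Y_j^{1,i-1}, Y^{*,i-1}, Z_{i+1}^n, Y_{j[1:\ell-1],i}^1, Y_{[1:\ell-1],i}^*, Z_{[\ell+1:L],i}, Z_{\ell,i}) + n(\epsilon_{1,n} + \gamma_n) \tag{140}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} H(Y_{j\ell,i}^1 | W_2, Y_j^{1,i-1}, Y^{*,i-1}, Z_{i+1}^n, Y_{j[1:\ell-1],i}^1, Y_{[1:\ell-1],i}^*, Z_{[\ell+1:L],i}, Z_{\ell,i}) - H(Y_{j\ell,i}^1 | W_2, Y_j^{1,i-1}, Y^{*,i-1}, Z_{i+1}^n, Y_{j[1:\ell-1],i}^1, Y_{[1:\ell-1],i}^*, Z_{[\ell+1:L],i}, Z_{\ell,i}, X_{\ell,i}) + n(\epsilon_{1,n} + \gamma_n) \tag{141}$$
$$\leq \sum_{i=1}^{n} \sum_{\ell=1}^{L} H(Y_{j\ell,i}^1 | W_2, Y^{*,i-1}, Z_{i+1}^n, Y_{[1:\ell-1],i}^*, Z_{[\ell+1:L],i}, Z_{\ell,i}) - H(Y_{j\ell,i}^1 | W_2, Y_j^{1,i-1}, Y^{*,i-1}, Z_{i+1}^n, Y_{j[1:\ell-1],i}^1, Y_{[1:\ell-1],i}^*, Z_{[\ell+1:L],i}, Z_{\ell,i}, X_{\ell,i}) + n(\epsilon_{1,n} + \gamma_n) \tag{142}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} H(Y_{j\ell,i}^1 | W_2, Y^{*,i-1}, Z_{i+1}^n, Y_{[1:\ell-1],i}^*, Z_{[\ell+1:L],i}, Z_{\ell,i}) - H(Y_{j\ell,i}^1 | W_2, Y^{*,i-1}, Z_{i+1}^n, Y_{[1:\ell-1],i}^*, Z_{[\ell+1:L],i}, Z_{\ell,i}, X_{\ell,i}) + n(\epsilon_{1,n} + \gamma_n) \tag{143}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(X_{\ell,i}; Y_{j\ell,i}^1 | W_2, Y^{*,i-1}, Z_{i+1}^n, Y_{[1:\ell-1],i}^*, Z_{[\ell+1:L],i}, Z_{\ell,i}) + n(\epsilon_{1,n} + \gamma_n) \tag{144}$$
$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(X_{\ell,i}; Y_{j\ell,i}^1 | U_{\ell,i}, Z_{\ell,i}) + n(\epsilon_{1,n} + \gamma_n) \tag{145}$$

where (137) follows from the Markov chain

$$Z_{[1:\ell-1],i} \to Y_{j[1:\ell-1],i}^1 \to (W_1, W_2, Y_j^{1,i-1}, Z_{i+1}^n, Y_{j\ell,i}^1, Z_{[\ell:L],i}) \tag{146}$$

which is due to the facts that the channel is degraded and memoryless, and sub-channels are
independent, (138) comes from the Markov chain

$$(Y^{*,i-1}, Y_{[1:\ell-1],i}^*) \to (Y_j^{1,i-1}, Y_{j[1:\ell-1],i}^1) \to (W_1, W_2, Z_{i+1}^n, Z_{[\ell:L],i}, Y_{j\ell,i}^1) \tag{147}$$

which results from the Markov chain in (10) and the facts that the channel is memoryless,
and sub-channels are independent, (140) comes from the Markov chain

$$(Y_{j\ell,i}^1, Z_{\ell,i}) \to X_{\ell,i} \to (W_1, W_2, Y_j^{1,i-1}, Y^{*,i-1}, Z_{i+1}^n, Y_{j[1:\ell-1],i}^1, Y_{[1:\ell-1],i}^*, Z_{[\ell+1:L],i}) \tag{148}$$

which is a consequence of the facts that the channel is memoryless, and sub-channels are
independent, (142) results from the fact that conditioning cannot increase entropy, and (143)
is due to the Markov chain in (148).

Next, we define a uniformly distributed random variable $Q \in \{1, \ldots, n\}$, and $U_\ell = (Q, U_{\ell,Q})$, $X_\ell = X_{\ell,Q}$, $Y_{j\ell}^1 = Y_{j\ell,Q}^1$, $Y_{k\ell}^2 = Y_{k\ell,Q}^2$, and $Z_\ell = Z_{\ell,Q}$. Using these definitions in
(132) and (145), we obtain the single-letter expressions in Theorem 2. Finally, we note that
although the auxiliary random variables $\{U_\ell\}_{\ell=1}^{L}$ are dependent, their joint distribution does not
affect the bounds in Theorem 2. Thus, without loss of generality, we can select them to be
independent.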



C Proof of Theorem 3

We first note that

$$\frac{1}{2} \log \frac{\sigma_*^2}{\sigma_Z^2} \leq h(X + N^* | U) - h(X + N_Z | U) \leq \frac{1}{2} \log \frac{P + \sigma_*^2}{P + \sigma_Z^2} \tag{149}$$

where the right-hand side can be shown via the entropy power inequality [18, 19]. To show
the left-hand side, let us define a Gaussian random variable $N$ with variance $\sigma_Z^2 - \sigma_*^2$, and
independent of $(U, X, N^*)$. Thus, we can write down the difference of differential entropy
terms in (149) as

$$h(X + N^* | U) - h(X + N_Z | U) = h(X + N^* | U) - h(X + N^* + N | U) \tag{150}$$
$$= -I(N; X + N^* + N | U) \tag{151}$$
$$= -h(N | U) + h(N | U, X + N^* + N) \tag{152}$$
$$\geq -h(N | U) + h(N | U, X + N^* + N, X) \tag{153}$$
$$= -h(N) + h(N | N^* + N) \tag{154}$$
$$= \frac{1}{2} \log \frac{\sigma_*^2}{\sigma_Z^2} \tag{155}$$

where (153) is due to the fact that conditioning cannot increase entropy and (154) is a
consequence of the fact that $(U, X)$ and $(N^*, N)$ are independent.
Equation (149) implies that there exists $P^*$ such that $P^* \leq P$ and

$$h(X + N^* | U) - h(X + N_Z | U) = \frac{1}{2} \log \frac{P^* + \sigma_*^2}{P^* + \sigma_Z^2} \tag{156}$$

which will be used frequently hereafter.

We now state Costa's entropy power inequality [15] which will be used in the upcoming
proof.

Lemma 3 ([15], Theorem 1) Let $(U, X)$ be an arbitrarily dependent random variable pair,
which is independent of $N$, where $N$ is a Gaussian random variable. Then, we have

$$e^{2h(X + \sqrt{t} N | U)} \geq (1 - t) e^{2h(X | U)} + t e^{2h(X + N | U)}, \quad 0 \leq t \leq 1 \tag{157}$$
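For intuition, when $X$ is Gaussian and $U$ is constant, both sides of (157) can be evaluated in closed form, and the inequality holds with equality because the entropy power of $X + \sqrt{t} N$ is then linear in $t$. The short check below illustrates this special case only; the function names and the numerical values are hypothetical, and entropies are measured in nats.

```python
import math

def gaussian_entropy(var):
    """Differential entropy (in nats) of a scalar Gaussian with variance var."""
    return 0.5 * math.log(2 * math.pi * math.e * var)

def costa_sides(var_x, var_n, t):
    """Left- and right-hand sides of (157) for Gaussian X and constant U."""
    lhs = math.exp(2 * gaussian_entropy(var_x + t * var_n))
    rhs = ((1 - t) * math.exp(2 * gaussian_entropy(var_x))
           + t * math.exp(2 * gaussian_entropy(var_x + var_n)))
    return lhs, rhs

for t in (0.0, 0.25, 0.5, 1.0):
    lhs, rhs = costa_sides(var_x=2.0, var_n=3.0, t=t)
    print(f"t={t:.2f}: lhs={lhs:.6f}, rhs={rhs:.6f}")
```

For non-Gaussian $X$ the left-hand side is strictly larger for $0 < t < 1$ in general, which is the concavity property exploited in the remainder of the proof.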



We now consider (43). We first note that we can write $N^*$ as

$$N^* = N_1 + \sqrt{t_1} \tilde{N}_1 \tag{158}$$

where $\tilde{N}_1$ is a Gaussian random variable with variance $\sigma_Z^2 - \sigma_1^2$, which is independent of
$(U, X, N_1)$. $t_1$ in (158) is given by

$$t_1 = \frac{\sigma_*^2 - \sigma_1^2}{\sigma_Z^2 - \sigma_1^2} \tag{159}$$

where it is clear that $t_1 \in [0, 1]$. Using (158) and Costa's entropy power inequality [15], we
get

$$e^{2h(X + N^* | U)} = e^{2h(X + N_1 + \sqrt{t_1} \tilde{N}_1 | U)} \tag{160}$$
$$\geq (1 - t_1) e^{2h(X + N_1 | U)} + t_1 e^{2h(X + N_Z | U)} \tag{161}$$

which is equivalent to

$$(1 - t_1) e^{2[h(X + N_1 | U) - h(X + N_Z | U)]} + t_1 \leq e^{2[h(X + N^* | U) - h(X + N_Z | U)]} \tag{162}$$
$$= \frac{P^* + \sigma_*^2}{P^* + \sigma_Z^2} \tag{163}$$

where (163) is obtained by using (156). Equation (163) is equivalent to

$$h(X + N_1 | U) - h(X + N_Z | U) \leq \frac{1}{2} \log \left[ \frac{1}{1 - t_1} \left( \frac{P^* + \sigma_*^2}{P^* + \sigma_Z^2} - t_1 \right) \right] \tag{164}$$
$$= \frac{1}{2} \log \frac{P^* + \sigma_1^2}{P^* + \sigma_Z^2} \tag{166}$$

where we used the definition of $t_1$ given in (159) to obtain (166). Equation (166) proves (43).

2 Although Theorem 1 of [15] states the inequality for a constant $U$, using Jensen's inequality, the current
form of the inequality for an arbitrary $U$ can be shown.

We now consider (44). First, we note that we can write $N_2$ as

$$N_2 = N^* + \sqrt{t_2} \tilde{N}_Z \tag{167}$$

where $\tilde{N}_Z$ is a Gaussian random variable with variance $\sigma_Z^2 - \sigma_*^2$, which is independent of
$(U, X, N^*)$. $t_2$ in (167) is given by

$$t_2 = \frac{\sigma_2^2 - \sigma_*^2}{\sigma_Z^2 - \sigma_*^2} \tag{168}$$

where it is clear that $t_2 \in [0, 1]$. Using (167) and Costa's entropy power inequality [15], we
get

$$e^{2h(X + N_2 | U)} = e^{2h(X + N^* + \sqrt{t_2} \tilde{N}_Z | U)} \tag{169}$$
$$\geq (1 - t_2) e^{2h(X + N^* | U)} + t_2 e^{2h(X + N_Z | U)} \tag{170}$$

which is equivalent to

$$e^{2[h(X + N_2 | U) - h(X + N_Z | U)]} \geq (1 - t_2) e^{2[h(X + N^* | U) - h(X + N_Z | U)]} + t_2 \tag{171}$$
$$= (1 - t_2) \frac{P^* + \sigma_*^2}{P^* + \sigma_Z^2} + t_2 \tag{172}$$
$$= \frac{P^* + \sigma_2^2}{P^* + \sigma_Z^2} \tag{173}$$

where (172) follows from (156), and (173) is obtained by using the definition of $t_2$ given in (168). Equation (173) is
equivalent to

$$h(X + N_Z | U) - h(X + N_2 | U) \leq \frac{1}{2} \log \frac{P^* + \sigma_Z^2}{P^* + \sigma_2^2} \tag{174}$$

which is (44). This completes the proof of Theorem 3.
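The two algebraic simplifications used in this proof, namely that $\frac{1}{1 - t_1}\left(\frac{P^* + \sigma_*^2}{P^* + \sigma_Z^2} - t_1\right)$ collapses to $\frac{P^* + \sigma_1^2}{P^* + \sigma_Z^2}$ and that $(1 - t_2)\frac{P^* + \sigma_*^2}{P^* + \sigma_Z^2} + t_2$ collapses to $\frac{P^* + \sigma_2^2}{P^* + \sigma_Z^2}$, can be sanity-checked numerically. The following sketch does so for one hypothetical set of variances; the function name and the chosen numbers are illustrative assumptions.

```python
import math

def check_costa_algebra(p_star, s1, s_star, s2, s_z):
    """Return both sides of the two simplifications for variances s1 <= s_star <= s2 <= s_z.

    t1 and t2 follow the definitions of the interpolation parameters in the proof.
    Requires s_star < s_z so that neither denominator vanishes.
    """
    t1 = (s_star - s1) / (s_z - s1)
    t2 = (s2 - s_star) / (s_z - s_star)
    pinned = (p_star + s_star) / (p_star + s_z)  # ratio fixed by the choice of P*
    first = ((pinned - t1) / (1 - t1), (p_star + s1) / (p_star + s_z))
    second = ((1 - t2) * pinned + t2, (p_star + s2) / (p_star + s_z))
    return first, second

first, second = check_costa_algebra(p_star=1.5, s1=1.0, s_star=2.0, s2=3.0, s_z=5.0)
print("first identity :", first)
print("second identity:", second)
```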



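D Proof of Theorem 4

Achievability is clear. We provide the converse proof. To this end, let us fix the distribution
$\prod_{\ell=1}^{L} p(u_\ell, x_\ell)$ such that

$$E[X_\ell^2] = P_\ell, \quad \ell = 1, \ldots, L \tag{175}$$

and $\sum_{\ell=1}^{L} P_\ell = P$. We first establish the bound on $R_2$ given in (46). To this end, we start
with (39). Using the Markov chain $U_\ell \to Y_{k\ell}^2 \to Z_\ell$, we have

$$R_2 \leq \min_{k=1,\ldots,K_2} \sum_{\ell=1}^{L} I(U_\ell; Y_{k\ell}^2) - I(U_\ell; Z_\ell) \tag{176}$$
$$= \min_{k=1,\ldots,K_2} \sum_{\ell=1}^{L} \left[h(Y_{k\ell}^2) - h(Z_\ell)\right] + \left[h(Z_\ell | U_\ell) - h(Y_{k\ell}^2 | U_\ell)\right] \tag{177}$$
$$\leq \min_{k=1,\ldots,K_2} \sum_{\ell=1}^{L} \frac{1}{2} \log \frac{P_\ell + \Lambda_{k,\ell\ell}^2}{P_\ell + \Lambda_{Z,\ell\ell}} + \left[h(Z_\ell | U_\ell) - h(Y_{k\ell}^2 | U_\ell)\right] \tag{178}$$

where (178) comes from the fact that a Gaussian distribution maximizes

$$h(Y_{k\ell}^2) - h(Z_\ell) \tag{179}$$

which can be shown via the entropy power inequality [18, 19]. We now use Theorem 3. For
that purpose, we introduce the diagonal covariance matrix $\boldsymbol{\Lambda}^*$ which satisfies

$$\boldsymbol{\Lambda}_j^1 \preceq \boldsymbol{\Lambda}^* \preceq \boldsymbol{\Lambda}_k^2 \tag{180}$$

for any $(j, k)$ pair, and in particular, for the diagonal elements of these matrices, we have

$$\Lambda_{j,\ell\ell}^1 \leq \Lambda_{\ell\ell}^* \leq \Lambda_{k,\ell\ell}^2 \tag{181}$$

for any triple $(j, k, \ell)$. Thus, due to Theorem 3, for any selection of $\{(U_\ell, X_\ell)\}_{\ell=1}^{L}$, there
exists a $P_\ell^*$ such that

$$P_\ell^* \leq P_\ell \tag{182}$$
$$h(Z_\ell | U_\ell) - h(Y_{j\ell}^1 | U_\ell) \geq \frac{1}{2} \log \frac{P_\ell^* + \Lambda_{Z,\ell\ell}}{P_\ell^* + \Lambda_{j,\ell\ell}^1} \tag{183}$$
$$h(Z_\ell | U_\ell) - h(Y_{k\ell}^2 | U_\ell) \leq \frac{1}{2} \log \frac{P_\ell^* + \Lambda_{Z,\ell\ell}}{P_\ell^* + \Lambda_{k,\ell\ell}^2} \tag{184}$$

for any triple $(j, k, \ell)$. Using (184) in (178), we get

$$R_2 \leq \min_{k=1,\ldots,K_2} \sum_{\ell=1}^{L} \frac{1}{2} \log \frac{P_\ell + \Lambda_{k,\ell\ell}^2}{P_\ell^* + \Lambda_{k,\ell\ell}^2} - \frac{1}{2} \log \frac{P_\ell + \Lambda_{Z,\ell\ell}}{P_\ell^* + \Lambda_{Z,\ell\ell}} \tag{185}$$

We define $P_\ell^* = \beta_\ell P_\ell$ and $\bar{\beta}_\ell = 1 - \beta_\ell$, $\ell = 1, \ldots, L$, where $\beta_\ell \in [0, 1]$ due to (182). Thus, we
have established the desired bound on $R_2$ given in (46). We now bound $R_1$. We start with
(38). Using the Markov chain $(U_\ell, X_\ell) \to Y_{j\ell}^1 \to Z_\ell$, we have

$$R_1 \leq \min_{j=1,\ldots,K_1} \sum_{\ell=1}^{L} I(X_\ell; Y_{j\ell}^1 | U_\ell) - I(X_\ell; Z_\ell | U_\ell) \tag{186}$$
$$= \min_{j=1,\ldots,K_1} \sum_{\ell=1}^{L} h(Y_{j\ell}^1 | U_\ell) - h(Z_\ell | U_\ell) - \frac{1}{2} \log \frac{\Lambda_{j,\ell\ell}^1}{\Lambda_{Z,\ell\ell}} \tag{187}$$
$$\leq \min_{j=1,\ldots,K_1} \sum_{\ell=1}^{L} \frac{1}{2} \log \frac{P_\ell^* + \Lambda_{j,\ell\ell}^1}{P_\ell^* + \Lambda_{Z,\ell\ell}} - \frac{1}{2} \log \frac{\Lambda_{j,\ell\ell}^1}{\Lambda_{Z,\ell\ell}} \tag{188}$$

where (188) comes from (183). Since we defined $P_\ell^* = \beta_\ell P_\ell$, (188) is the desired bound on
$R_1$ given in (45), completing the proof.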
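The last step of the $R_2$ bound rewrites a ratio of the form $\frac{P_\ell + \Lambda}{P_\ell^* + \Lambda}$ as $1 + \frac{\bar{\beta}_\ell P_\ell}{\beta_\ell P_\ell + \Lambda}$ after substituting $P_\ell^* = \beta_\ell P_\ell$. The following sketch checks numerically that the two forms of a single sub-channel term agree; the function names, the example values, and the base-2 logarithm are assumptions made for illustration.

```python
import math

def r2_term_power_form(P, p_star, lam2, lam_z):
    """One sub-channel term written with P and P* (the form obtained before the substitution)."""
    return (0.5 * math.log2((P + lam2) / (p_star + lam2))
            - 0.5 * math.log2((P + lam_z) / (p_star + lam_z)))

def r2_term_beta_form(P, beta, lam2, lam_z):
    """The same term after substituting P* = beta * P and beta_bar = 1 - beta."""
    beta_bar = 1.0 - beta
    return (0.5 * math.log2(1.0 + beta_bar * P / (beta * P + lam2))
            - 0.5 * math.log2(1.0 + beta_bar * P / (beta * P + lam_z)))

P, beta, lam2, lam_z = 4.0, 0.3, 2.0, 5.0
a = r2_term_power_form(P, beta * P, lam2, lam_z)
b = r2_term_beta_form(P, beta, lam2, lam_z)
print(f"power form = {a:.6f}, beta form = {b:.6f}")
```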



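E Proof of Theorem 6

The main tools for the proof of Theorem 6 are Theorem 5 and the following so-called worst
additive noise lemma [20, 21].

Lemma 4 Let $\mathbf{N}$ be a Gaussian random vector with covariance matrix $\boldsymbol{\Sigma}$, and $\mathbf{K}_X$ be a
positive semi-definite matrix. Consider the following optimization problem,

$$\min_{p(\mathbf{x})} I(\mathbf{N}; \mathbf{N} + \mathbf{X}) \quad \text{s.t.} \quad \text{Cov}(\mathbf{X}) = \mathbf{K}_X \tag{189}$$

where $\mathbf{X}$ and $\mathbf{N}$ are independent. A Gaussian $\mathbf{X}$ is the minimizer of this optimization
problem.

We first bound $R_2$. Assume we fixed the distribution of $(U, \mathbf{X})$ such that $\text{Cov}(\mathbf{X}) = \mathbf{K}_X$.
Then, we have

$$R_2 \leq I(U; \mathbf{Y}^2) - I(U; \mathbf{Z}) \tag{190}$$
$$= h(\mathbf{Y}^2) - h(\mathbf{Z}) + \left[h(\mathbf{Z} | U) - h(\mathbf{Y}^2 | U)\right] \tag{191}$$
$$\leq \frac{1}{2} \log \frac{|\mathbf{S} + \boldsymbol{\Sigma}^2|}{|\mathbf{S} + \boldsymbol{\Sigma}_Z|} + \left[h(\mathbf{Z} | U) - h(\mathbf{Y}^2 | U)\right] \tag{192}$$

To show (192), consider $\mathbf{N}$ which is a Gaussian random vector with covariance matrix $\boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}^2$, and is independent of $(U, \mathbf{X}, \mathbf{N}_2)$. Thus, we can write

$$h(\mathbf{Y}^2) - h(\mathbf{Z}) = h(\mathbf{Z} | \mathbf{N}) - h(\mathbf{Z}) \tag{193}$$
$$= -I(\mathbf{N}; \mathbf{X} + \mathbf{N}_2 + \mathbf{N}) \tag{194}$$
$$\leq \frac{1}{2} \log \frac{|\mathbf{K}_X + \boldsymbol{\Sigma}^2|}{|\mathbf{K}_X + \boldsymbol{\Sigma}_Z|} \tag{195}$$
$$\leq \frac{1}{2} \log \frac{|\mathbf{S} + \boldsymbol{\Sigma}^2|}{|\mathbf{S} + \boldsymbol{\Sigma}_Z|} \tag{196}$$

where (195) is due to Lemma 4 and (196) follows from the fact that

$$\frac{|\mathbf{A}|}{|\mathbf{A} + \mathbf{B}|} \leq \frac{|\mathbf{A} + \boldsymbol{\Delta}|}{|\mathbf{A} + \mathbf{B} + \boldsymbol{\Delta}|} \tag{197}$$

for $\mathbf{A} \succ \mathbf{0}$, $\mathbf{B} \succeq \mathbf{0}$, $\boldsymbol{\Delta} \succeq \mathbf{0}$ [3, 17].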
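The determinant inequality in (197) can be spot-checked numerically. The sketch below draws random $2 \times 2$ positive semi-definite matrices as Gram matrices, adds a small multiple of the identity to $\mathbf{A}$ so that it is positive definite, and verifies the inequality on every draw. The helper names, the matrix size, the seed, and the tolerance are hypothetical choices; this is an empirical check, not a proof.

```python
import random

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def add(*ms):
    """Entrywise sum of any number of 2x2 matrices."""
    return [[sum(m[i][j] for m in ms) for j in range(2)] for i in range(2)]

def random_psd(rng):
    """Random 2x2 positive semi-definite matrix built as G G^T."""
    g = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
    return [[sum(g[i][k] * g[j][k] for k in range(2)) for j in range(2)] for i in range(2)]

def inequality_holds(A, B, D, tol=1e-12):
    """Check |A| / |A + B| <= |A + D| / |A + B + D| up to a small tolerance."""
    return det2(A) / det2(add(A, B)) <= det2(add(A, D)) / det2(add(A, B, D)) + tol

rng = random.Random(0)
ridge = [[0.1, 0.0], [0.0, 0.1]]  # keeps A strictly positive definite
all_hold = True
for _ in range(1000):
    A = add(random_psd(rng), ridge)
    B = random_psd(rng)
    D = random_psd(rng)
    all_hold = all_hold and inequality_holds(A, B, D)
print("inequality held on all draws:", all_hold)
```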

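For the rest of the proof, we need Theorem 5. According to Theorem 5, for any $(U, \mathbf{X})$,
there exists a $\mathbf{0} \preceq \mathbf{K} \preceq \text{Cov}(\mathbf{X} | U)$ such that

$$h(\mathbf{Z} | U) - h(\mathbf{Y}^2 | U) = \frac{1}{2} \log \frac{|\mathbf{K} + \boldsymbol{\Sigma}_Z|}{|\mathbf{K} + \boldsymbol{\Sigma}^2|} \tag{198}$$

$$h(\mathbf{Z} | U) - h(\mathbf{Y}_j^1 | U) \geq \frac{1}{2} \log \frac{|\mathbf{K} + \boldsymbol{\Sigma}_Z|}{|\mathbf{K} + \boldsymbol{\Sigma}_j^1|}, \quad j = 1, \ldots, K_1 \tag{199}$$

because $\boldsymbol{\Sigma}_j^1 \preceq \boldsymbol{\Sigma}^2$, $j = 1, \ldots, K_1$. Using (198) in (192) yields

$$R_2 \leq \frac{1}{2} \log \frac{|\mathbf{S} + \boldsymbol{\Sigma}^2|}{|\mathbf{K} + \boldsymbol{\Sigma}^2|} - \frac{1}{2} \log \frac{|\mathbf{S} + \boldsymbol{\Sigma}_Z|}{|\mathbf{K} + \boldsymbol{\Sigma}_Z|} \tag{200}$$

which is the desired bound on $R_2$.

The desired bound on $R_1$ can be obtained as follows

$$R_1 \leq \min_{j=1,\ldots,K_1} I(\mathbf{X}; \mathbf{Y}_j^1 | U) - I(\mathbf{X}; \mathbf{Z} | U) \tag{201}$$
$$= \min_{j=1,\ldots,K_1} h(\mathbf{Y}_j^1 | U) - h(\mathbf{Z} | U) - \frac{1}{2} \log \frac{|\boldsymbol{\Sigma}_j^1|}{|\boldsymbol{\Sigma}_Z|} \tag{202}$$
$$\leq \min_{j=1,\ldots,K_1} \frac{1}{2} \log \frac{|\mathbf{K} + \boldsymbol{\Sigma}_j^1|}{|\mathbf{K} + \boldsymbol{\Sigma}_Z|} - \frac{1}{2} \log \frac{|\boldsymbol{\Sigma}_j^1|}{|\boldsymbol{\Sigma}_Z|} \tag{203}$$
$$= \min_{j=1,\ldots,K_1} \frac{1}{2} \log \frac{|\mathbf{K} + \boldsymbol{\Sigma}_j^1|}{|\boldsymbol{\Sigma}_j^1|} - \frac{1}{2} \log \frac{|\mathbf{K} + \boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{204}$$

where (203) is due to (199). This completes the proof of Theorem 6.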

F Proofs of Lemma H and Theorem ffl 
F.l Proof of Lemma [I] 

The optimization problem in ( 1551) can be put into the following alternative form 

max a + ub (205) 

s.t. #g(K)>a, j = l,...,K 1 (206) 
/? 2 G fc (K) > £>, fc = l,...,K 2 (207) 

which has the Lagrangian 

£(K) = a + fib + X U (^S( K ) ~o)+fiJ2 X2k ( R 2k( K ) ~ b )+ tr(KM) 
i=i fc=i 

+ tr((S-K)M s ) (208) 
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where M and Ms are positive semi-definite matrices, and {\ij}j=i and {A2/ C } fc 2 1 are non- 
negative. The KKT conditions are given by 

(209) 

(210) 
(211) 

j = l,...,Kx (212) 
k = l,...,K 2 (213) 

(214) 
(215) 

The KKT conditions in f )209p and (I210p yield Y^fli = 1 an d J2k=i ^2fc = 1, respectively 
Furthermore, the KKT conditions in (12121 and (I2I3J1 imply Ay = when #g(K*) > R* x 
and A2A: = when Rf k (K*) > R?,, respectively. The KKT condition in (12111) results in 
fl60|) . Finally, since tr(AB) = tr(BA) > when A >z and B >z 0, we need to have 
K*M = MK* = and (S - K*)M S = M S (S - K*) = 0. 



ac(K) 

da 



a=R\ 



dC(K) 

db 


6=R* 


= 


v k £(k)| k =k* 


= 


A y (J2g(K*)- 




= 


A»(iS(K*) - 




= 


tr(K*M) 


= 


tr((S-K*)M s ) 


= 
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F.2 Proof of Theorem H 

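Let us fix $\{\lambda_{1j}\}_{j=1}^{K_1}$ and $\{\lambda_{2k}\}_{k=1}^{K_2}$ as they are defined in the lemma proved in Appendix F.1. We have

$$\max_{\mathbf{0} \preceq \mathbf{K} \preceq \mathbf{S}} \; \min_{j=1,\dots,K_1} R_{1j}^G(\mathbf{K}) + \mu \min_{k=1,\dots,K_2} R_{2k}^G(\mathbf{K})$$

$$\le \max_{(U,\mathbf{X})} \; \min_{j=1,\dots,K_1} R_{1j} + \mu \min_{k=1,\dots,K_2} R_{2k} \tag{216}$$

$$\le \max_{(U,\mathbf{X})} \; \sum_{j=1}^{K_1} \lambda_{1j}\left[I(\mathbf{X};\mathbf{Y}_j^1|U) - I(\mathbf{X};\mathbf{Z}|U)\right] + \mu \sum_{k=1}^{K_2} \lambda_{2k}\left[I(U;\mathbf{Y}_k^2) - I(U;\mathbf{Z})\right] \tag{217}$$

$$= \max_{(U,\mathbf{X})} \; \sum_{j=1}^{K_1} \lambda_{1j}\left[h(\mathbf{Y}_j^1|U) - h(\mathbf{Z}|U) - \frac{1}{2}\log\frac{|\boldsymbol{\Sigma}_j^1|}{|\boldsymbol{\Sigma}_Z|}\right] + \mu \sum_{k=1}^{K_2} \lambda_{2k}\left[h(\mathbf{Y}_k^2) - h(\mathbf{Z})\right] - \mu \sum_{k=1}^{K_2} \lambda_{2k}\left[h(\mathbf{Y}_k^2|U) - h(\mathbf{Z}|U)\right] \tag{218}$$

$$\le \max_{(U,\mathbf{X})} \; \sum_{j=1}^{K_1} \lambda_{1j}\left[h(\mathbf{Y}_j^1|U) - h(\mathbf{Z}|U) - \frac{1}{2}\log\frac{|\boldsymbol{\Sigma}_j^1|}{|\boldsymbol{\Sigma}_Z|}\right] + \mu \sum_{k=1}^{K_2} \lambda_{2k}\,\frac{1}{2}\log\frac{|\mathbf{S}+\boldsymbol{\Sigma}_k^2|}{|\mathbf{S}+\boldsymbol{\Sigma}_Z|} - \mu \sum_{k=1}^{K_2} \lambda_{2k}\left[h(\mathbf{Y}_k^2|U) - h(\mathbf{Z}|U)\right] \tag{219}$$

$$\le \max_{\mathbf{0} \preceq \mathbf{K} \preceq \mathbf{S}} \; \sum_{j=1}^{K_1} \lambda_{1j}\left[\frac{1}{2}\log\frac{|\mathbf{K}+\boldsymbol{\Sigma}_j^1|}{|\boldsymbol{\Sigma}_j^1|} - \frac{1}{2}\log\frac{|\mathbf{K}+\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|}\right] + \mu \sum_{k=1}^{K_2} \lambda_{2k}\left[\frac{1}{2}\log\frac{|\mathbf{S}+\boldsymbol{\Sigma}_k^2|}{|\mathbf{K}+\boldsymbol{\Sigma}_k^2|} - \frac{1}{2}\log\frac{|\mathbf{S}+\boldsymbol{\Sigma}_Z|}{|\mathbf{K}+\boldsymbol{\Sigma}_Z|}\right] \tag{220}$$

$$= \max_{\mathbf{0} \preceq \mathbf{K} \preceq \mathbf{S}} \; \min_{j=1,\dots,K_1} R_{1j}^G(\mathbf{K}) + \mu \min_{k=1,\dots,K_2} R_{2k}^G(\mathbf{K}) \tag{221}$$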



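where (219) comes from the fact that

$$h(\mathbf{Y}_k^2) - h(\mathbf{Z}) \le \frac{1}{2}\log\frac{|\mathbf{S}+\boldsymbol{\Sigma}_k^2|}{|\mathbf{S}+\boldsymbol{\Sigma}_Z|} \tag{222}$$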



which is a consequence of the worst additive lemma in Lemma HJ (12201) results from Lemma [21 
(12211) is due to Lemmas Q] and [2j Thus, we have shown that 

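$$\max_{\mathbf{0} \preceq \mathbf{K} \preceq \mathbf{S}} \; \min_{j=1,\dots,K_1} R_{1j}^G(\mathbf{K}) + \mu \min_{k=1,\dots,K_2} R_{2k}^G(\mathbf{K}) = \max_{(U,\mathbf{X})} \; \min_{j=1,\dots,K_1} R_{1j} + \mu \min_{k=1,\dots,K_2} R_{2k} \tag{223}$$

for $\mu \le 1$, which completes the proof of the theorem.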
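The chain above replaces each minimum over users by a weighted sum whose weights are non-negative and sum to one. A minimum never exceeds such a convex combination, and the two coincide when all the weight sits on a minimizing user, which is what the complementary slackness conditions enforce. The sketch below illustrates this with hypothetical rate values and weights that are not taken from the paper.

```python
# Illustrative check (values are hypothetical): a minimum is bounded above by
# any convex combination, with equality when the weights pick a minimizer.

def convex_combination(values, weights):
    """Weighted sum of values; weights must be non-negative and sum to one."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be non-negative and sum to one")
    return sum(w * v for w, v in zip(weights, values))

rates = [0.75, 0.5, 1.25]      # hypothetical per-user rates
weights = [0.25, 0.5, 0.25]    # hypothetical multipliers summing to one

print(min(rates))
print(convex_combination(rates, weights))
# Weights concentrated on the minimizing user recover the minimum itself.
print(convex_combination(rates, [0.0, 1.0, 0.0]))
```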
G Proof of Theorem 8



We first show the achievability of the region given in Theorem 8, then provide the converse proof.

G.1 Achievability

We fix the distribution p(u, x). 

Codebook generation: 

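• Generate $2^{n(R_2+\tilde{R}_2)}$ length-$n$ $\mathbf{u}$ sequences through $p(\mathbf{u}) = \prod_{i=1}^{n} p(u_i)$. Consider the permutation $\pi_U$ on $\{1,\dots,K_Z\}$ such that

$$I(U;Z_{\pi_U(1)}) \le \dots \le I(U;Z_{\pi_U(K_Z)}) \tag{224}$$

We set $\tilde{R}_2$ as

$$\tilde{R}_2 = \max_{t=1,\dots,K_Z} I(U;Z_t) = I(U;Z_{\pi_U(K_Z)}) \tag{225}$$

We index $\mathbf{u}$ sequences as $\mathbf{u}(w_2, \tilde{w}_{21}, \dots, \tilde{w}_{2K_Z})$ where $w_2 \in \{1,\dots,2^{nR_2}\}$ and $\tilde{w}_{2t} \in \{1,\dots,2^{n\tilde{R}_{2t}}\}$, $t = 1,\dots,K_Z$. $\tilde{R}_{2t}$ is given by

$$\tilde{R}_{2t} = I(U;Z_{\pi_U(t)}) - I(U;Z_{\pi_U(t-1)}), \quad t = 1,\dots,K_Z \tag{226}$$

where we set $I(U;Z_{\pi_U(0)}) = 0$. We note that

$$\sum_{t=1}^{m} \tilde{R}_{2t} = I(U;Z_{\pi_U(m)}) \tag{227}$$

and in particular, for $m = K_Z$,

$$\sum_{t=1}^{K_Z} \tilde{R}_{2t} = I(U;Z_{\pi_U(K_Z)}) = \max_{t=1,\dots,K_Z} I(U;Z_t) = \tilde{R}_2 \tag{228}$$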

• For each $\mathbf{u}$, generate $2^{n(R_1+\tilde{R}_1)}$ length-$n$ $\mathbf{x}$ sequences through $p(\mathbf{x}|\mathbf{u}) = \prod_{i=1}^{n} p(x_i|u_i)$. Consider the permutation $\pi_X$ on $\{1,\dots,K_2\}$ such that

$$I(X;Y^2_{\pi_X(1)}|U) \le \dots \le I(X;Y^2_{\pi_X(K_2)}|U) \tag{229}$$

We set $\tilde{R}_1$ as

$$\tilde{R}_1 = I(X;Y^2_{\pi_X(K_2)}|U) = \max_{k=1,\dots,K_2} I(X;Y^2_k|U) \tag{230}$$

We index $\mathbf{x}$ sequences as $\mathbf{x}(w_1, \tilde{w}_{11}, \dots, \tilde{w}_{1K_2} | \mathbf{w}_2)$ where $\mathbf{w}_2 = (w_2, \tilde{w}_{21}, \dots, \tilde{w}_{2K_Z})$, $w_1 \in \{1,\dots,2^{nR_1}\}$, and $\tilde{w}_{1k} \in \{1,\dots,2^{n\tilde{R}_{1k}}\}$, $k = 1,\dots,K_2$. $\tilde{R}_{1k}$ is given by

$$\tilde{R}_{1k} = I(X;Y^2_{\pi_X(k)}|U) - I(X;Y^2_{\pi_X(k-1)}|U), \quad k = 1,\dots,K_2 \tag{231}$$

where we set $I(X;Y^2_{\pi_X(0)}|U) = 0$. We note that

$$\sum_{k=1}^{m} \tilde{R}_{1k} = I(X;Y^2_{\pi_X(m)}|U) \tag{232}$$

and in particular, for $m = K_2$, we have

$$\sum_{k=1}^{K_2} \tilde{R}_{1k} = I(X;Y^2_{\pi_X(K_2)}|U) = \max_{k=1,\dots,K_2} I(X;Y^2_k|U) = \tilde{R}_1 \tag{233}$$
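The codebook construction splits the total randomization rate into increments, one per receiver that must be kept ignorant, after sorting those receivers by how much they can learn. The sketch below reproduces this telescoping split for a list of hypothetical leakage values; the function names and the numbers are assumptions made for illustration only.

```python
# Illustrative sketch (hypothetical numbers): split a randomization rate into
# per-receiver increments, mirroring the sorted, telescoping construction.

def split_dummy_rates(leakage):
    """Return (order, increments): receivers sorted by leakage, and the
    increment assigned to each position of the sorted order."""
    order = sorted(range(len(leakage)), key=lambda t: leakage[t])
    increments = []
    previous = 0.0  # plays the role of the leakage at position 0, set to zero
    for t in order:
        increments.append(leakage[t] - previous)
        previous = leakage[t]
    return order, increments

def partial_sums(increments):
    """Running sums of the increments."""
    total, sums = 0.0, []
    for r in increments:
        total += r
        sums.append(total)
    return sums

leakage = [0.5, 0.25, 1.0]  # hypothetical mutual informations, one per receiver
order, increments = split_dummy_rates(leakage)
print(order)                     # receivers from weakest to strongest
print(increments)                # non-negative rate increments
print(partial_sums(increments))  # m-th partial sum equals the m-th sorted leakage
```

The final partial sum equals the largest leakage, which is the total randomization rate used in the scheme.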

Encoding: 

If $(w_1, w_2)$ is the message to be transmitted, we pick $\{\tilde{w}_{1k}\}_{k=1}^{K_2}$ and $\{\tilde{w}_{2t}\}_{t=1}^{K_Z}$ independently and uniformly, and send the corresponding $\mathbf{x}$.

Decoding: 

The legitimate users can decode the messages with vanishingly small probability of error, if the rates satisfy

$$R_1 + \tilde{R}_1 \le \min_{j=1,\dots,K_1} I(X;Y_j^1|U) \tag{234}$$

$$R_2 + \tilde{R}_2 \le \min_{k=1,\dots,K_2} I(U;Y_k^2) \tag{235}$$

where we used the degradedness of the channel. Plugging in the expressions for $\tilde{R}_2$ and $\tilde{R}_1$ given in (225) and (230), respectively, we get

$$R_1 \le \min_{\substack{j=1,\dots,K_1 \\ k=1,\dots,K_2}} I(X;Y_j^1|U) - I(X;Y_k^2|U) \tag{236}$$

$$R_2 \le \min_{\substack{k=1,\dots,K_2 \\ t=1,\dots,K_Z}} I(U;Y_k^2) - I(U;Z_t) \tag{237}$$

which is the same as the region given in Theorem 8 because of the degradedness of the channel.
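Because the two terms in each bound depend on different indices, the joint minimum separates into the smallest decodable rate minus the largest leakage. The following sketch evaluates both bounds this way and compares them against a brute-force minimum over all index pairs; the mutual information values are hypothetical inputs, not results from the paper.

```python
# Illustrative sketch (hypothetical inputs): evaluate the two rate bounds as
# "worst legitimate rate minus largest leakage" and cross-check by brute force.

def secrecy_rate_bounds(i_x_y1, i_x_y2, i_u_y2, i_u_z):
    """i_x_y1[j] ~ I(X;Y_j^1|U), i_x_y2[k] ~ I(X;Y_k^2|U),
    i_u_y2[k] ~ I(U;Y_k^2), i_u_z[t] ~ I(U;Z_t)."""
    r1 = min(i_x_y1) - max(i_x_y2)
    r2 = min(i_u_y2) - max(i_u_z)
    return r1, r2

def brute_force_bounds(i_x_y1, i_x_y2, i_u_y2, i_u_z):
    """Same bounds computed as an explicit minimum over all index pairs."""
    r1 = min(a - b for a in i_x_y1 for b in i_x_y2)
    r2 = min(a - b for a in i_u_y2 for b in i_u_z)
    return r1, r2

args = ([2.0, 1.75], [1.0, 1.25], [0.75, 1.0], [0.25, 0.5])
print(secrecy_rate_bounds(*args))
print(brute_force_bounds(*args))
```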



Equivocation computation: 

We now show that this coding scheme satisfies the secrecy requirements given in (66) and (67). We start with (66)

$$H(W_2|Z^n_{\pi_U(t)}) = H(W_2, Z^n_{\pi_U(t)}) - H(Z^n_{\pi_U(t)}) \tag{238}$$

$$= H(W_2, Z^n_{\pi_U(t)}, U^n) - H(U^n|W_2, Z^n_{\pi_U(t)}) - H(Z^n_{\pi_U(t)}) \tag{239}$$

$$= H(U^n) + H(W_2, Z^n_{\pi_U(t)}|U^n) - H(U^n|W_2, Z^n_{\pi_U(t)}) - H(Z^n_{\pi_U(t)}) \tag{240}$$

$$\ge H(U^n) - I(U^n;Z^n_{\pi_U(t)}) - H(U^n|W_2, Z^n_{\pi_U(t)}) \tag{241}$$

where we treat each term separately. Since $U^n$ can take $2^{n(R_2+\tilde{R}_2)}$ values uniformly, for the first term, we have

$$H(U^n) = n(R_2 + \tilde{R}_2) \tag{242}$$

Following Lemma 8 of [1], the second term in (241) can be bounded as

$$I(U^n;Z^n_{\pi_U(t)}) \le n I(U;Z_{\pi_U(t)}) + n\epsilon_{2,n} \tag{243}$$

where $\epsilon_{2,n} \to 0$ as $n \to \infty$. We now consider the third term of (241)

$$H(U^n|W_2, Z^n_{\pi_U(t)}) \le H(U^n, \tilde{W}_{2(t+1)}, \dots, \tilde{W}_{2K_Z}|W_2, Z^n_{\pi_U(t)}) \tag{244}$$

$$\le H(\tilde{W}_{2(t+1)}, \dots, \tilde{W}_{2K_Z}) + H(U^n|W_2, \tilde{W}_{2(t+1)}, \dots, \tilde{W}_{2K_Z}, Z^n_{\pi_U(t)}) \tag{245}$$

The first term in (245) is

$$H(\tilde{W}_{2(t+1)}, \dots, \tilde{W}_{2K_Z}) = \sum_{\ell=t+1}^{K_Z} H(\tilde{W}_{2\ell}) \tag{246}$$

$$= \sum_{\ell=t+1}^{K_Z} n\tilde{R}_{2\ell} \tag{247}$$

$$= n I(U;Z_{\pi_U(K_Z)}) - n I(U;Z_{\pi_U(t)}) \tag{248}$$

where (246) is due to the independence of $\{\tilde{W}_{2t}\}_{t=1}^{K_Z}$, (247) is due to the fact that $\tilde{W}_{2t}$ can take $2^{n\tilde{R}_{2t}}$ values uniformly and independently for $t = 1,\dots,K_Z$, and in (248), we used the definitions of $\{\tilde{R}_{2t}\}_{t=1}^{K_Z}$ given in (226). We next consider the second term in (245). For that purpose, we note that given

$$\left(W_2 = w_2,\; \tilde{W}_{2(t+1)} = \tilde{w}_{2(t+1)},\; \dots,\; \tilde{W}_{2K_Z} = \tilde{w}_{2K_Z}\right) \tag{249}$$

$U^n$ can take $2^{nI(U;Z_{\pi_U(t)})}$ values. Thus, given the side information in (249), the $\pi_U(t)$th eavesdropper can decode $U^n$ with vanishingly small probability of error, which implies that

$$H(U^n|W_2, \tilde{W}_{2(t+1)}, \dots, \tilde{W}_{2K_Z}, Z^n_{\pi_U(t)}) \le n\gamma_{2,n} \tag{250}$$

due to Fano's lemma where $\gamma_{2,n} \to 0$ as $n \to \infty$. Hence, plugging (248) and (250) in (245) yields

$$H(U^n|W_2, Z^n_{\pi_U(t)}) \le n I(U;Z_{\pi_U(K_Z)}) - n I(U;Z_{\pi_U(t)}) + n\gamma_{2,n} \tag{251}$$

Finally, using (242), (243) and (251) in (241) yields

$$H(W_2|Z^n_{\pi_U(t)}) \ge n(R_2 + \tilde{R}_2) - n\epsilon_{2,n} - n I(U;Z_{\pi_U(K_Z)}) - n\gamma_{2,n} \tag{252}$$

$$= nR_2 - n(\epsilon_{2,n} + \gamma_{2,n}) \tag{253}$$

where we used (225). Since (253) implies (66), the proposed coding scheme ensures perfect secrecy for the second group of users.

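We now consider the second secrecy requirement given in (67).

$$H(W_1|W_2, Y^{2,n}_{\pi_X(k)}) \ge H(W_1|W_2, Y^{2,n}_{\pi_X(k)}, U^n) \tag{254}$$

$$= H(W_1|Y^{2,n}_{\pi_X(k)}, U^n) \tag{255}$$

$$= H(W_1, Y^{2,n}_{\pi_X(k)}|U^n) - H(Y^{2,n}_{\pi_X(k)}|U^n) \tag{256}$$

$$= H(X^n, W_1, Y^{2,n}_{\pi_X(k)}|U^n) - H(X^n|W_1, Y^{2,n}_{\pi_X(k)}, U^n) - H(Y^{2,n}_{\pi_X(k)}|U^n) \tag{257}$$

$$= H(X^n|U^n) + H(W_1, Y^{2,n}_{\pi_X(k)}|U^n, X^n) - H(X^n|W_1, Y^{2,n}_{\pi_X(k)}, U^n) - H(Y^{2,n}_{\pi_X(k)}|U^n) \tag{258}$$

$$\ge H(X^n|U^n) - I(X^n;Y^{2,n}_{\pi_X(k)}|U^n) - H(X^n|W_1, Y^{2,n}_{\pi_X(k)}, U^n) \tag{259}$$

where (255) is due to the Markov chain $W_2 \to U^n \to (W_1, Y^{2,n}_{\pi_X(k)})$ which originates from the coding scheme we proposed. Since given $U^n = u^n$, $X^n$ can take $2^{n(R_1+\tilde{R}_1)}$ values uniformly and independently, the first term in (259) is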

$$H(X^n|U^n) = n(R_1 + \tilde{R}_1) \tag{260}$$

Following Lemma 8 of [1], the second term in (259) can be bounded as

$$I(X^n;Y^{2,n}_{\pi_X(k)}|U^n) \le n I(X;Y^2_{\pi_X(k)}|U) + n\epsilon_{1,n} \tag{261}$$

where $\epsilon_{1,n} \to 0$ as $n \to \infty$. We now consider the third term in (259)

$$H(X^n|W_1, U^n, Y^{2,n}_{\pi_X(k)}) \le H(X^n, \tilde{W}_{1(k+1)}, \dots, \tilde{W}_{1K_2}|W_1, U^n, Y^{2,n}_{\pi_X(k)}) \tag{262}$$

$$\le H(\tilde{W}_{1(k+1)}, \dots, \tilde{W}_{1K_2}) + H(X^n|W_1, U^n, Y^{2,n}_{\pi_X(k)}, \tilde{W}_{1(k+1)}, \dots, \tilde{W}_{1K_2}) \tag{263}$$

where the first term is given by

$$H(\tilde{W}_{1(k+1)}, \dots, \tilde{W}_{1K_2}) = \sum_{\ell=k+1}^{K_2} H(\tilde{W}_{1\ell}) \tag{264}$$

$$= \sum_{\ell=k+1}^{K_2} n\tilde{R}_{1\ell} \tag{265}$$

$$= n I(X;Y^2_{\pi_X(K_2)}|U) - n I(X;Y^2_{\pi_X(k)}|U) \tag{266}$$

where (264) is due to the independence of $\{\tilde{W}_{1k}\}_{k=1}^{K_2}$, (265) comes from the fact that $\tilde{W}_{1k}$ can take $2^{n\tilde{R}_{1k}}$ values uniformly and independently, and in (266), we used (231). We now bound the second term of (263). For that purpose, we first note that given

$$\left(U^n = u^n,\; W_1 = w_1,\; \tilde{W}_{1(k+1)} = \tilde{w}_{1(k+1)},\; \dots,\; \tilde{W}_{1K_2} = \tilde{w}_{1K_2}\right) \tag{267}$$

$X^n$ can take $2^{nI(X;Y^2_{\pi_X(k)}|U)}$ values. Thus, given the side information in (267), the $\pi_X(k)$th user in the second group can decode $X^n$ with vanishingly small probability of error, leading to

$$H(X^n|W_1, U^n, Y^{2,n}_{\pi_X(k)}, \tilde{W}_{1(k+1)}, \dots, \tilde{W}_{1K_2}) \le n\gamma_{1,n} \tag{268}$$

due to Fano's lemma where $\gamma_{1,n} \to 0$ as $n \to \infty$. Plugging (266) and (268) into (263) yields

$$H(X^n|W_1, U^n, Y^{2,n}_{\pi_X(k)}) \le n I(X;Y^2_{\pi_X(K_2)}|U) - n I(X;Y^2_{\pi_X(k)}|U) + n\gamma_{1,n} \tag{269}$$

Finally, using (260), (261) and (269) in (259) results in

$$H(W_1|W_2, Y^{2,n}_{\pi_X(k)}) \ge nR_1 + n\tilde{R}_1 - n I(X;Y^2_{\pi_X(K_2)}|U) - n(\epsilon_{1,n} + \gamma_{1,n}) \tag{270}$$

$$= nR_1 - n(\epsilon_{1,n} + \gamma_{1,n}) \tag{271}$$

where we used (230). Since this implies (67), the proposed coding scheme ensures perfect secrecy for the first group of users, completing the proof.
G.2 Converse

First, we note that for an arbitrary code achieving the secrecy rate pairs $(R_1, R_2)$, there exist $(\epsilon_{1,n}, \epsilon_{2,n})$ and $(\gamma_{1,n}, \gamma_{2,n})$ which vanish as $n \to \infty$ such that

$$H(W_1|Y_j^{1,n}) \le n\epsilon_{1,n}, \quad j = 1,\dots,K_1 \tag{272}$$

$$H(W_2|Y_k^{2,n}) \le n\epsilon_{2,n}, \quad k = 1,\dots,K_2 \tag{273}$$

$$I(W_2;Z_t^n) \le n\gamma_{2,n}, \quad t = 1,\dots,K_Z \tag{274}$$

$$I(W_1;Y_k^{2,n}|W_2) \le n\gamma_{1,n}, \quad k = 1,\dots,K_2 \tag{275}$$

where (272) and (273) are due to Fano's lemma, and (274) and (275) come from the perfect secrecy requirements in (66) and (67).

We now define the following auxiliary random variables

$$U_i = W_2\, Y^{*,i-1}\, Z^{*,n}_{i+1}, \quad i = 1,\dots,n \tag{276}$$

which satisfy the Markov chains

$$U_i \to X_i \to Y^1_{j,i} \to Y^*_i \to Y^2_{k,i} \to Z^*_i \to Z_{t,i}, \quad i = 1,\dots,n \tag{277}$$

for any $(j,k,t)$ triple. The Markov chain in (277) is a consequence of the fact that the channel is memoryless and degraded.
We first establish the desired bound on $R_2$ as follows

$$nR_2 = H(W_2) \tag{278}$$

$$\le I(W_2;Y_k^{2,n}) + n\epsilon_{2,n} \tag{279}$$

$$\le I(W_2;Y_k^{2,n}) - I(W_2;Z_t^n) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{280}$$

$$= I(W_2;Y_k^{2,n}|Z_t^n) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{281}$$

$$= \sum_{i=1}^{n} I(W_2;Y^2_{k,i}|Z_t^n, Y_k^{2,i-1}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{282}$$

$$= \sum_{i=1}^{n} I(W_2;Y^2_{k,i}|Y_k^{2,i-1}, Z^n_{t,i+1}, Z_{t,i}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{283}$$

$$\le \sum_{i=1}^{n} I(Z^n_{t,i+1}, Y_k^{2,i-1}, W_2;Y^2_{k,i}|Z_{t,i}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{284}$$

$$\le \sum_{i=1}^{n} I(Y^{*,i-1}, Z^n_{t,i+1}, Y_k^{2,i-1}, W_2;Y^2_{k,i}|Z_{t,i}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{285}$$

$$\le \sum_{i=1}^{n} I(Z^{*,n}_{i+1}, Y^{*,i-1}, W_2;Y^2_{k,i}|Z_{t,i}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{286}$$

$$= \sum_{i=1}^{n} I(U_i;Y^2_{k,i}|Z_{t,i}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{287}$$

where (281) is due to the Markov chain

$$W_2 \to Y_k^{2,n} \to Z_t^n \tag{288}$$

which comes from the fact that the channel is degraded, (283) results from the Markov chain

$$Z_t^{i-1} \to Y_k^{2,i-1} \to (W_2, Y^2_{k,i}, Z^n_{t,i}) \tag{289}$$

which is a consequence of the fact that the channel is memoryless and degraded, and (286) is due to the Markov chain

$$(Z^n_{t,i+1}, Y_k^{2,i-1}) \to (Z^{*,n}_{i+1}, Y^{*,i-1}) \to (W_2, Y^2_{k,i}, Z_{t,i}) \tag{290}$$

which is a consequence of the Markov chain in (2).
We now establish the bound on $R_1$ as follows

$$nR_1 = H(W_1) \tag{291}$$

$$= H(W_1|W_2) \tag{292}$$

$$\le I(W_1;Y_j^{1,n}|W_2) + n\epsilon_{1,n} \tag{293}$$

$$\le I(W_1;Y_j^{1,n}|W_2) - I(W_1;Y_k^{2,n}|W_2) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{294}$$

$$= I(W_1;Y_j^{1,n}|W_2, Y_k^{2,n}) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{295}$$

$$= \sum_{i=1}^{n} I(W_1;Y^1_{j,i}|W_2, Y_k^{2,n}, Y_j^{1,i-1}) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{296}$$

$$= \sum_{i=1}^{n} I(W_1;Y^1_{j,i}|W_2, Y^{2,n}_{k,i+1}, Y_j^{1,i-1}, Y^2_{k,i}) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{297}$$

$$= \sum_{i=1}^{n} I(W_1;Y^1_{j,i}|W_2, Y^{2,n}_{k,i+1}, Y_j^{1,i-1}, Z^{*,n}_{i+1}, Y^{*,i-1}, Y^2_{k,i}) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{298}$$

$$= \sum_{i=1}^{n} I(W_1;Y^1_{j,i}|U_i, Y^{2,n}_{k,i+1}, Y_j^{1,i-1}, Y^2_{k,i}) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{299}$$

$$\le \sum_{i=1}^{n} I(X_i, W_1;Y^1_{j,i}|U_i, Y^{2,n}_{k,i+1}, Y_j^{1,i-1}, Y^2_{k,i}) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{300}$$

$$= \sum_{i=1}^{n} I(X_i;Y^1_{j,i}|U_i, Y^{2,n}_{k,i+1}, Y_j^{1,i-1}, Y^2_{k,i}) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{301}$$

$$= \sum_{i=1}^{n} H(Y^1_{j,i}|U_i, Y^{2,n}_{k,i+1}, Y_j^{1,i-1}, Y^2_{k,i}) - H(Y^1_{j,i}|U_i, Y^{2,n}_{k,i+1}, Y_j^{1,i-1}, Y^2_{k,i}, X_i) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{302}$$

$$= \sum_{i=1}^{n} H(Y^1_{j,i}|U_i, Y^{2,n}_{k,i+1}, Y_j^{1,i-1}, Y^2_{k,i}) - H(Y^1_{j,i}|U_i, Y^2_{k,i}, X_i) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{303}$$

$$\le \sum_{i=1}^{n} H(Y^1_{j,i}|U_i, Y^2_{k,i}) - H(Y^1_{j,i}|U_i, Y^2_{k,i}, X_i) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{304}$$

$$= \sum_{i=1}^{n} I(X_i;Y^1_{j,i}|U_i, Y^2_{k,i}) + n(\epsilon_{1,n} + \gamma_{1,n}) \tag{305}$$

where (295) is due to the Markov chain

$$(W_1, W_2) \to Y_j^{1,n} \to Y_k^{2,n} \tag{306}$$

which comes from the degradedness of the channel, (297) results from the Markov chain

$$Y_k^{2,i-1} \to Y_j^{1,i-1} \to (W_1, W_2, Y^1_{j,i}, Y^{2,n}_{k,i}) \tag{307}$$

which is again due to the degradedness of the channel, (298) is a consequence of the Markov chain

$$(Z^{*,n}_{i+1}, Y^{*,i-1}) \to (Y^{2,n}_{k,i+1}, Y_j^{1,i-1}) \to (W_2, W_1, Y^1_{j,i}, Y^2_{k,i}) \tag{308}$$

which results from the Markov chain in (2), (301) comes from the Markov chain

$$(Y^2_{k,i}, Y^1_{j,i}) \to X_i \to (W_1, W_2, U_i, Y^{2,n}_{k,i+1}, Y_j^{1,i-1}) \tag{309}$$

which is due to the fact that the channel is memoryless, (303) is also due to the Markov chain in (309), and (304) comes from the fact that conditioning cannot increase entropy.

Single-letterization can be accomplished as outlined in the proofs of Theorems CD and [2J 
completing the converse proof. 



H Proof of Theorem 9 



The achievability of the region given in Theorem 9 can be shown by selecting $(U, X) = (U_1, X_1, \dots, U_L, X_L)$ with a joint distribution of the form $p(u, x) = \prod_{\ell=1}^{L} p(u_\ell, x_\ell)$. We next provide the converse proof. To that end, we define the following auxiliary random variables

$$U_{\ell,i} = W_2\, Y^{*,i-1}\, Z^{*,n}_{i+1}\, Y^*_{[1:\ell-1],i}\, Z^*_{[\ell+1:L],i}, \quad i = 1,\dots,n, \quad \ell = 1,\dots,L \tag{310}$$

which satisfy the Markov chains

$$U_{\ell,i} \to X_{\ell,i} \to Y^1_{j\ell,i} \to Y^*_{\ell,i} \to Y^2_{k\ell,i} \to Z^*_{\ell,i} \to Z_{t\ell,i}, \quad i = 1,\dots,n, \quad \ell = 1,\dots,L \tag{311}$$

for any $(j,k,t)$ triple. These Markov chains are a consequence of the facts that the channel is memoryless and degraded, and sub-channels are independent.

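We first establish the desired bound on $R_2$. For that purpose, following the proof of Theorem 8, we get

$$nR_2 \le \sum_{i=1}^{n} I(W_2;Y^2_{k,i}|Y_k^{2,i-1}, Z^n_{t,i+1}, Z_{t,i}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{312}$$

$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(W_2;Y^2_{k\ell,i}|Y_k^{2,i-1}, Z^n_{t,i+1}, Z_{t,i}, Y^2_{k[1:\ell-1],i}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{313}$$

$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(W_2;Y^2_{k\ell,i}|Y_k^{2,i-1}, Z^n_{t,i+1}, Z_{t[\ell+1:L],i}, Y^2_{k[1:\ell-1],i}, Z_{t\ell,i}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{314}$$

$$\le \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(Y^{*,i-1}, Z^{*,n}_{i+1}, Z^*_{[\ell+1:L],i}, Y^*_{[1:\ell-1],i}, Y_k^{2,i-1}, Z^n_{t,i+1}, Z_{t[\ell+1:L],i}, Y^2_{k[1:\ell-1],i}, W_2;Y^2_{k\ell,i}|Z_{t\ell,i}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{315}$$

$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(Y^{*,i-1}, Z^{*,n}_{i+1}, Z^*_{[\ell+1:L],i}, Y^*_{[1:\ell-1],i}, W_2;Y^2_{k\ell,i}|Z_{t\ell,i}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{316}$$

$$= \sum_{i=1}^{n} \sum_{\ell=1}^{L} I(U_{\ell,i};Y^2_{k\ell,i}|Z_{t\ell,i}) + n(\epsilon_{2,n} + \gamma_{2,n}) \tag{317}$$

where (314) comes from the Markov chain

$$Z_{t[1:\ell-1],i} \to Y^2_{k[1:\ell-1],i} \to (W_2, Y_k^{2,i-1}, Y^2_{k\ell,i}, Z^n_{t,i+1}, Z_{t[\ell:L],i}) \tag{318}$$

which is a consequence of the facts that the channel is memoryless and sub-channels are independent, (316) results from the Markov chain

$$(Y_k^{2,i-1}, Z^n_{t,i+1}, Z_{t[\ell+1:L],i}, Y^2_{k[1:\ell-1],i}) \to (Y^{*,i-1}, Z^{*,n}_{i+1}, Z^*_{[\ell+1:L],i}, Y^*_{[1:\ell-1],i}) \to (W_2, Y^2_{k\ell,i}, Z_{t\ell,i}) \tag{319}$$

which is a consequence of the Markov chain in (10).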
We now bound $R_1$. Following the proof of Theorem 8, we get

n 

nRi < E W; Yl ti \W 2 , Y^~\ Y k 2 t ) + n(e 1>n + 7l , n ) (320) 

i=l 
n L 

= E E Y U W ^ ^i'*" 1 . F fcK-i> *m> ^-i],J + + 7i,») (321) 
i=i e=i 

n L 

= E E F A^> i? ,4 - x , nT + i' Y w+^ ^i^u^ F i) + ™( e ^ + ti,») (322) 
i=i £=i 

i=l £=1 
n L 

i=l £=1 
n L 

= E E ^(^>*' F /'* 1 ' *M+1> ^+l:L],i) F /[l:£-l],i> F fcl,») + n ( e l,n + ll,n) (325) 

i=l £=1 
n L 

1=1 £=1 

~ #0#,il^,ij Yj'* ,Y^ +1 , Y^ £+1:L]ti , Y^ 1:t _-q ti , Y^, X^) + n(ei jn + 7 liW ) (326) 

n L 

— E E ^"(^Ail^M' ^j 1 '* \ Y^f +1 , ^f[l:i-l],i, ^feg,i) ~ H^Y^^U^, Y^, X( }i ) 

i=l £=1 

+ n(e 1)n + 7i,n) (327) 

n L 

< E E # ( y /d *«,<) - H i Y hP^ Y lv X ^ + ™( e M + 7i,n) (328) 
1=1 £=1 

n L 

= E E 7 ( X ^ F A^' Y ^) + ™( e M + 7i,n) (329) 

i=l 1=1 

where (13221) is due to the Markov chain 

Yk[i : e-i],i —* ^|i:£-i],i —* W 2 , Y-' 1 , Y" fc 2 -™ v Y^.^, Y^J (330) 

which is a consequence of the degradedness of the channel, and the fact that sub-channels 
are independent and memoryless, (13231) results from the Markov chain 

(Y*' 1 \ Zj.fi> Z[e+kL],n (Yj 1 ' 1 1 iY^ 1 ,Y^^ t+1 .^ i ,Y^ lu _ 1 ^ i ) — > (Wi, W 2 , Y^, Y^) 

(331) 

which is a consequence of the Markov chain in (1101) , (I325p and (13271) come from the Markov 
chain



(Wi, Ui t i, Yj ' % , Y k '™ +V Yk[i+v.L\,n ^j[vi-i],i} ~~ * — > (1/.^, y,^) (332) 

which is a consequence of the fact that sub-channels are independent and memoryless. 

We can obtain the desired single-letter expressions as is done in the proof of Theorem 2, completing the proof.



I Proof of Theorem 10



According to Theorem 3, there exists a $P^* \le P$ such that

$$h(X+N|U) - h(X+N^*|U) = \frac{1}{2}\log\frac{P^*+\sigma^2}{P^*+\sigma^{*2}} \tag{333}$$

$$h(X+N|U) - h(X+N_2|U) \le \frac{1}{2}\log\frac{P^*+\sigma^2}{P^*+\sigma_2^2} \tag{334}$$

$$h(X+N|U) - h(X+N_1|U) \ge \frac{1}{2}\log\frac{P^*+\sigma^2}{P^*+\sigma_1^2} \tag{335}$$

for any $(\sigma_1^2, \sigma_2^2)$ as long as they satisfy

$$\sigma_1^2 \le \sigma^{*2} \le \sigma_2^2 \le \sigma^2 \tag{336}$$

We first show (75). To this end, we note that (333) and (334) imply

$$h(X+N_2|U) - h(X+N^*|U) \ge \frac{1}{2}\log\frac{P^*+\sigma_2^2}{P^*+\sigma^{*2}} \tag{337}$$

Furthermore, (333) and (335) imply

$$h(X+N^*|U) - h(X+N_1|U) \ge \frac{1}{2}\log\frac{P^*+\sigma^{*2}}{P^*+\sigma_1^2} \tag{338}$$

Combining (337) and (338) yields

$$h(X+N_2|U) - h(X+N_1|U) \ge \frac{1}{2}\log\frac{P^*+\sigma_2^2}{P^*+\sigma_1^2} \tag{339}$$

which is the desired result in (75).

We now show (77). We first note that we can write $N$ as

$$N = N_2 + \sqrt{t}\,\tilde{N}_Z \tag{340}$$

where $\tilde{N}_Z$ is a zero-mean Gaussian random variable with variance $\sigma_Z^2 - \sigma_2^2$, and independent of $(U, X, N_2)$, so that $N_2 + \tilde{N}_Z$ has the same distribution as $N_Z$. $t$ in (340) is given by

$$t = \frac{\sigma^2 - \sigma_2^2}{\sigma_Z^2 - \sigma_2^2} \tag{341}$$

where it is clear that $t \in [0,1]$. We now use Costa's entropy power inequality [15] to arrive at (77)

$$e^{2h(X+N|U)} = e^{2h(X+N_2+\sqrt{t}\tilde{N}_Z|U)} \tag{342}$$

$$\ge (1-t)\,e^{2h(X+N_2|U)} + t\,e^{2h(X+N_Z|U)} \tag{343}$$

which is equivalent to

$$e^{2[h(X+N|U)-h(X+N_2|U)]} \ge (1-t) + t\,e^{2[h(X+N_Z|U)-h(X+N_2|U)]} \tag{344}$$

which can be written as

$$h(X+N_Z|U) - h(X+N_2|U) \le \frac{1}{2}\log\left[\frac{1}{t}\,e^{2[h(X+N|U)-h(X+N_2|U)]} - \frac{1-t}{t}\right] \tag{345}$$

$$\le \frac{1}{2}\log\left[\frac{1}{t}\,\frac{P^*+\sigma^2}{P^*+\sigma_2^2} - \frac{1-t}{t}\right] \tag{346}$$

$$= \frac{1}{2}\log\left[\frac{P^*}{P^*+\sigma_2^2} + \frac{1}{t}\,\frac{\sigma^2-(1-t)\sigma_2^2}{P^*+\sigma_2^2}\right] \tag{347}$$

$$= \frac{1}{2}\log\frac{P^*+\sigma_Z^2}{P^*+\sigma_2^2} \tag{348}$$

where (346) is due to (334) and (348) comes from (341). Since (348) is the desired result in (77), this completes the proof.
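The last steps of this proof are pure algebra: once the interpolation parameter is chosen as the ratio of variance gaps, the bracketed expression obtained from Costa's inequality collapses to a single ratio involving the eavesdropper noise variance. The sketch below checks that collapse numerically for one set of hypothetical variances; the function names and numbers are illustrative assumptions, not quantities from the paper.

```python
import math

# Illustrative check (hypothetical variances): the bound obtained after
# applying Costa's inequality simplifies to 0.5*log((P* + s_z)/(P* + s_2)).

def interpolation_parameter(s, s_2, s_z):
    """t such that s = s_2 + t * (s_z - s_2); lies in [0, 1] when s_2 <= s <= s_z."""
    return (s - s_2) / (s_z - s_2)

def bound_before_simplification(p_star, s, s_2, t):
    """0.5 * log( (1/t) * (P* + s)/(P* + s_2) - (1 - t)/t )."""
    ratio = (p_star + s) / (p_star + s_2)
    return 0.5 * math.log(ratio / t - (1.0 - t) / t)

def bound_after_simplification(p_star, s_2, s_z):
    """0.5 * log( (P* + s_z)/(P* + s_2) )."""
    return 0.5 * math.log((p_star + s_z) / (p_star + s_2))

p_star, s_2, s, s_z = 2.0, 1.0, 2.0, 4.0   # hypothetical, with s_2 <= s <= s_z
t = interpolation_parameter(s, s_2, s_z)
print(t)
print(bound_before_simplification(p_star, s, s_2, t))
print(bound_after_simplification(p_star, s_2, s_z))
```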



J Proof of Theorem 11



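Achievability is clear. We provide the converse proof. We fix the distribution $\prod_{\ell=1}^{L} p(u_\ell, x_\ell)$ such that

$$E\left[X_\ell^2\right] = P_\ell, \quad \ell = 1,\dots,L \tag{349}$$

and $\sum_{\ell=1}^{L} P_\ell = P$. We first establish the bound on $R_2$ given in (80). To this end, we start with (73). Using the Markov chain $U_\ell \to Y^2_{k\ell} \to Z_{t\ell}$, we have

$$R_2 \le \min_{\substack{k=1,\dots,K_2 \\ t=1,\dots,K_Z}} \sum_{\ell=1}^{L} I(U_\ell;Y^2_{k\ell}) - I(U_\ell;Z_{t\ell}) \tag{350}$$

$$= \min_{\substack{k=1,\dots,K_2 \\ t=1,\dots,K_Z}} \sum_{\ell=1}^{L} h(Y^2_{k\ell}) - h(Z_{t\ell}) + \left[h(Z_{t\ell}|U_\ell) - h(Y^2_{k\ell}|U_\ell)\right] \tag{351}$$

$$\le \min_{\substack{k=1,\dots,K_2 \\ t=1,\dots,K_Z}} \sum_{\ell=1}^{L} \frac{1}{2}\log\frac{P_\ell+\Lambda^2_{k,\ell\ell}}{P_\ell+\Lambda^Z_{t,\ell\ell}} + \left[h(Z_{t\ell}|U_\ell) - h(Y^2_{k\ell}|U_\ell)\right] \tag{352}$$

where (352) comes from the fact that

$$h(Y^2_{k\ell}) - h(Z_{t\ell}) \tag{353}$$

is maximized by a Gaussian distribution, which can be shown by using the entropy power inequality [18,19]. We now use Theorem 10. For that purpose, we introduce $\boldsymbol{\Lambda}^*_Y$ and $\boldsymbol{\Lambda}^*_Z$ which satisfy

$$\boldsymbol{\Lambda}^1_j \preceq \boldsymbol{\Lambda}^*_Y \preceq \boldsymbol{\Lambda}^2_k \preceq \boldsymbol{\Lambda}^*_Z \preceq \boldsymbol{\Lambda}^Z_t \tag{354}$$

for any $(j,k,t)$ triple, and in particular, for the diagonal elements of these matrices, we have

$$\Lambda^1_{j,\ell\ell} \le \Lambda^*_{Y,\ell\ell} \le \Lambda^2_{k,\ell\ell} \le \Lambda^*_{Z,\ell\ell} \le \Lambda^Z_{t,\ell\ell} \tag{355}$$

for any $(j,k,t,\ell)$. Thus, due to Theorem 10, for any selection of $\{(U_\ell, X_\ell)\}_{\ell=1}^{L}$, we have

$$P^*_\ell \le P_\ell \tag{356}$$

$$h(Z_{t\ell}|U_\ell) - h(Y^2_{k\ell}|U_\ell) \le \frac{1}{2}\log\frac{P^*_\ell+\Lambda^Z_{t,\ell\ell}}{P^*_\ell+\Lambda^2_{k,\ell\ell}} \tag{357}$$

$$h(Y^2_{k\ell}|U_\ell) - h(Y^1_{j\ell}|U_\ell) \ge \frac{1}{2}\log\frac{P^*_\ell+\Lambda^2_{k,\ell\ell}}{P^*_\ell+\Lambda^1_{j,\ell\ell}} \tag{358}$$

for any $(k,j,t,\ell)$. Using (357) in (352) yields

$$R_2 \le \min_{\substack{k=1,\dots,K_2 \\ t=1,\dots,K_Z}} \sum_{\ell=1}^{L} \frac{1}{2}\log\frac{P_\ell+\Lambda^2_{k,\ell\ell}}{P^*_\ell+\Lambda^2_{k,\ell\ell}} - \frac{1}{2}\log\frac{P_\ell+\Lambda^Z_{t,\ell\ell}}{P^*_\ell+\Lambda^Z_{t,\ell\ell}} \tag{359}$$

By defining $P^*_\ell = \beta_\ell P_\ell$ and $\bar{\beta}_\ell = 1 - \beta_\ell$, $\ell = 1,\dots,L$, where $\beta_\ell \in [0,1]$ due to (356), we get the desired bound on $R_2$ given in (80).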
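The substitution of a power-splitting fraction for the auxiliary power is a change of variables, and each per-sub-channel term can be rewritten as a difference of two "one plus signal-to-noise" logarithms. The sketch below checks this rewriting numerically for one sub-channel. The second form is our own algebraic rewriting, assumed here for illustration rather than quoted from the theorem statement, and all numbers are hypothetical.

```python
import math

# Illustrative check (hypothetical numbers): with P* = beta * P, the term
#   0.5*log((P + L2)/(P* + L2)) - 0.5*log((P + LZ)/(P* + LZ))
# equals
#   0.5*log(1 + (1-beta)P/(beta P + L2)) - 0.5*log(1 + (1-beta)P/(beta P + LZ)).

def term_power_form(p, p_star, noise_2, noise_z):
    """Per-sub-channel term written with the auxiliary power P*."""
    return (0.5 * math.log((p + noise_2) / (p_star + noise_2))
            - 0.5 * math.log((p + noise_z) / (p_star + noise_z)))

def term_beta_form(p, beta, noise_2, noise_z):
    """Same term written with the power-splitting fraction beta."""
    beta_bar = 1.0 - beta
    return (0.5 * math.log(1.0 + beta_bar * p / (beta * p + noise_2))
            - 0.5 * math.log(1.0 + beta_bar * p / (beta * p + noise_z)))

p, beta = 4.0, 0.25            # hypothetical sub-channel power and split
noise_2, noise_z = 1.0, 3.0    # hypothetical noise variances, noise_2 <= noise_z
print(term_power_form(p, beta * p, noise_2, noise_z))
print(term_beta_form(p, beta, noise_2, noise_z))
```

The term is non-negative whenever the eavesdropper's noise variance is at least that of the legitimate user, which matches the degradedness order assumed in the proof.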



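We now bound $R_1$. We start with (72). Using the Markov chain $U_\ell \to X_\ell \to Y^1_{j\ell} \to Y^2_{k\ell}$, we have

$$R_1 \le \min_{\substack{j=1,\dots,K_1 \\ k=1,\dots,K_2}} \sum_{\ell=1}^{L} I(X_\ell;Y^1_{j\ell}|U_\ell) - I(X_\ell;Y^2_{k\ell}|U_\ell) \tag{360}$$

$$= \min_{\substack{j=1,\dots,K_1 \\ k=1,\dots,K_2}} \sum_{\ell=1}^{L} h(Y^1_{j\ell}|U_\ell) - h(Y^2_{k\ell}|U_\ell) - \frac{1}{2}\log\frac{\Lambda^1_{j,\ell\ell}}{\Lambda^2_{k,\ell\ell}} \tag{361}$$

$$\le \min_{\substack{j=1,\dots,K_1 \\ k=1,\dots,K_2}} \sum_{\ell=1}^{L} \frac{1}{2}\log\frac{P^*_\ell+\Lambda^1_{j,\ell\ell}}{P^*_\ell+\Lambda^2_{k,\ell\ell}} - \frac{1}{2}\log\frac{\Lambda^1_{j,\ell\ell}}{\Lambda^2_{k,\ell\ell}} \tag{362}$$

$$= \min_{\substack{j=1,\dots,K_1 \\ k=1,\dots,K_2}} \sum_{\ell=1}^{L} \frac{1}{2}\log\left(1+\frac{\beta_\ell P_\ell}{\Lambda^1_{j,\ell\ell}}\right) - \frac{1}{2}\log\left(1+\frac{\beta_\ell P_\ell}{\Lambda^2_{k,\ell\ell}}\right) \tag{363}$$

where (362) is due to (358). Since (363) is the desired bound on $R_1$ in Theorem 11, this completes the proof.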



K Background Information for Appendix L

In Appendix L, we need some properties of the Fisher information and the differential entropy, which are provided here.

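Definition 1 ([3], Definition 3) Let $(\mathbf{U}, \mathbf{X})$ be an arbitrarily correlated length-$n$ random vector pair with well-defined densities. The conditional Fisher information matrix of $\mathbf{X}$ given $\mathbf{U}$ is defined as

$$\mathbf{J}(\mathbf{X}|\mathbf{U}) = E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\,\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^{\top}\right] \tag{364}$$

where the expectation is over the joint density $f(\mathbf{u}, \mathbf{x})$, and the conditional score function $\boldsymbol{\rho}(\mathbf{x}|\mathbf{u})$ is

$$\boldsymbol{\rho}(\mathbf{x}|\mathbf{u}) = \nabla \log f(\mathbf{x}|\mathbf{u}) = \left[\frac{\partial \log f(\mathbf{x}|\mathbf{u})}{\partial x_1} \;\; \dots \;\; \frac{\partial \log f(\mathbf{x}|\mathbf{u})}{\partial x_n}\right]^{\top} \tag{365}$$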



The following lemma will be used in the upcoming proof. In fact, an unconditional 
version of this lemma is proved in Lemma 6 of [3]. 

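Lemma 5 Let $\mathbf{T}, \mathbf{U}, \mathbf{V}_1, \mathbf{V}_2$ be random vectors such that $(\mathbf{T}, \mathbf{U})$ and $(\mathbf{V}_1, \mathbf{V}_2)$ are independent. Moreover, let $\mathbf{V}_1, \mathbf{V}_2$ be Gaussian random vectors with covariance matrices $\boldsymbol{\Sigma}_1, \boldsymbol{\Sigma}_2$ such that $\mathbf{0} \prec \boldsymbol{\Sigma}_1 \preceq \boldsymbol{\Sigma}_2$. Then, we have

$$\mathbf{J}^{-1}(\mathbf{U}+\mathbf{V}_2|\mathbf{T}) - \boldsymbol{\Sigma}_2 \succeq \mathbf{J}^{-1}(\mathbf{U}+\mathbf{V}_1|\mathbf{T}) - \boldsymbol{\Sigma}_1 \tag{366}$$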
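The following lemma, whose proof can be found in [3], is also instrumental for the upcoming proof.

Lemma 6 ([3], Lemma 8) Let $\mathbf{K}_1, \mathbf{K}_2$ be positive semi-definite matrices satisfying $\mathbf{0} \preceq \mathbf{K}_1 \preceq \mathbf{K}_2$, and $\mathbf{f}(\mathbf{K})$ be a matrix-valued function such that $\mathbf{f}(\mathbf{K}) \succeq \mathbf{0}$ for $\mathbf{K}_1 \preceq \mathbf{K} \preceq \mathbf{K}_2$. Then, we have

$$\int_{\mathbf{K}_1}^{\mathbf{K}_2} \mathbf{f}(\mathbf{K})\, d\mathbf{K} \ge 0 \tag{367}$$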

The following generalization of the de Bruijn identity [18, 19] is due to [22]. In [22], the unconditional form of this identity, i.e., the case where $U = \emptyset$, is proved. However, its generalization to this conditional form for an arbitrary $U$ is rather straightforward, and is given in Lemma 16 of [3].

Lemma 7 ([3], Lemma 16) Let $(\mathbf{U}, \mathbf{X})$ be an arbitrarily correlated random vector pair with finite second order moments, and be independent of the random vector $\mathbf{N}$ which is zero-mean Gaussian with covariance matrix $\boldsymbol{\Sigma}_N \succ \mathbf{0}$. Then, we have

$$\nabla_{\boldsymbol{\Sigma}_N} h(\mathbf{X}+\mathbf{N}|\mathbf{U}) = \frac{1}{2}\,\mathbf{J}(\mathbf{X}+\mathbf{N}|\mathbf{U}) \tag{368}$$
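In the simplest special case, a scalar Gaussian input with no conditioning, both sides of the identity have closed forms and can be compared directly. The sketch below does this with a central finite difference; the choice of a Gaussian input, the absence of conditioning, and all numerical values are assumptions made only for illustration.

```python
import math

# Illustrative scalar check (assumes a Gaussian X with variance p and no
# conditioning): d/d(sigma2) h(X + N) should equal 0.5 * J(X + N).

def gaussian_entropy(variance):
    """Differential entropy (in nats) of a scalar Gaussian."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)

def gaussian_fisher(variance):
    """Fisher information of a scalar Gaussian with respect to location."""
    return 1.0 / variance

def de_bruijn_gap(p, sigma2, step=1e-6):
    """Absolute difference between the numerical derivative of the entropy
    with respect to the noise variance and half the Fisher information."""
    derivative = (gaussian_entropy(p + sigma2 + step)
                  - gaussian_entropy(p + sigma2 - step)) / (2.0 * step)
    return abs(derivative - 0.5 * gaussian_fisher(p + sigma2))

print(de_bruijn_gap(1.0, 0.5))
# In the same Gaussian case, 1/J(X + N) - sigma2 equals p for every noise
# variance, so the ordering in Lemma 5 holds with equality here.
print(1.0 / gaussian_fisher(1.0 + 0.5) - 0.5, 1.0 / gaussian_fisher(1.0 + 2.0) - 2.0)
```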



L Proof of Theorem 12



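According to Theorem 5, for any selection of $(U, \mathbf{X})$, there exists a $\mathbf{K}^* \preceq \mathbf{S}$ such that

$$h(\mathbf{X}+\mathbf{N}^*|U) - h(\mathbf{X}+\mathbf{N}_2|U) = \frac{1}{2}\log\frac{|\mathbf{K}^*+\boldsymbol{\Sigma}^*|}{|\mathbf{K}^*+\boldsymbol{\Sigma}_2|} \tag{369}$$

$$h(\mathbf{X}+\mathbf{N}^*|U) - h(\mathbf{X}+\mathbf{N}_1|U) \ge \frac{1}{2}\log\frac{|\mathbf{K}^*+\boldsymbol{\Sigma}^*|}{|\mathbf{K}^*+\boldsymbol{\Sigma}_1|} \tag{370}$$

for any $\boldsymbol{\Sigma}_1$ such that $\boldsymbol{\Sigma}_1 \preceq \boldsymbol{\Sigma}_2$. Furthermore, $\mathbf{K}^*$ satisfies [3]

$$\mathbf{K}^* \preceq \mathbf{J}^{-1}(\mathbf{X}+\mathbf{N}^*|U) - \boldsymbol{\Sigma}^* \tag{371}$$

Equations (369) and (370) already imply

$$h(\mathbf{X}+\mathbf{N}_2|U) - h(\mathbf{X}+\mathbf{N}_1|U) \ge \frac{1}{2}\log\frac{|\mathbf{K}^*+\boldsymbol{\Sigma}_2|}{|\mathbf{K}^*+\boldsymbol{\Sigma}_1|} \tag{372}$$

for any $\boldsymbol{\Sigma}_1$ such that $\boldsymbol{\Sigma}_1 \preceq \boldsymbol{\Sigma}_2$, which is one of the two desired inequalities of Theorem 12.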
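We now prove (83). For that purpose, we note that (371) implies

$$\mathbf{K}^* \preceq \mathbf{J}^{-1}(\mathbf{X}+\mathbf{N}|U) - \boldsymbol{\Sigma}_N \tag{373}$$

for any Gaussian random vector $\mathbf{N}$, independent of $(U, \mathbf{X})$, with covariance matrix $\boldsymbol{\Sigma}_N$ such that $\boldsymbol{\Sigma}_N \succeq \boldsymbol{\Sigma}^*$ because of Lemma 5. The order in (373) is equivalent to

$$\mathbf{J}(\mathbf{X}+\mathbf{N}|U) \preceq (\mathbf{K}^*+\boldsymbol{\Sigma}_N)^{-1} \tag{374}$$

Now, we can obtain (83) as follows

$$h(\mathbf{X}+\mathbf{N}_Z|U) - h(\mathbf{X}+\mathbf{N}_2|U) = h(\mathbf{X}+\mathbf{N}_Z|U) - h(\mathbf{X}+\mathbf{N}^*|U) + h(\mathbf{X}+\mathbf{N}^*|U) - h(\mathbf{X}+\mathbf{N}_2|U) \tag{375}$$

$$= h(\mathbf{X}+\mathbf{N}_Z|U) - h(\mathbf{X}+\mathbf{N}^*|U) + \frac{1}{2}\log\frac{|\mathbf{K}^*+\boldsymbol{\Sigma}^*|}{|\mathbf{K}^*+\boldsymbol{\Sigma}_2|} \tag{376}$$

$$= \frac{1}{2}\int_{\boldsymbol{\Sigma}^*}^{\boldsymbol{\Sigma}_Z} \mathbf{J}(\mathbf{X}+\mathbf{N}|U)\, d\boldsymbol{\Sigma}_N + \frac{1}{2}\log\frac{|\mathbf{K}^*+\boldsymbol{\Sigma}^*|}{|\mathbf{K}^*+\boldsymbol{\Sigma}_2|} \tag{377}$$

$$\le \frac{1}{2}\int_{\boldsymbol{\Sigma}^*}^{\boldsymbol{\Sigma}_Z} (\mathbf{K}^*+\boldsymbol{\Sigma}_N)^{-1}\, d\boldsymbol{\Sigma}_N + \frac{1}{2}\log\frac{|\mathbf{K}^*+\boldsymbol{\Sigma}^*|}{|\mathbf{K}^*+\boldsymbol{\Sigma}_2|} \tag{378}$$

$$= \frac{1}{2}\log\frac{|\mathbf{K}^*+\boldsymbol{\Sigma}_Z|}{|\mathbf{K}^*+\boldsymbol{\Sigma}_2|} \tag{379}$$

where (376) is due to (369), (377) is obtained by using Lemma 7, and (378) comes from Lemma 6 by noting (374). Since (379) is the desired inequality in (83), this completes the proof.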
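The final step of this proof evaluates an integral of an inverse "signal plus noise" covariance along a path of noise covariances, which yields a log-determinant ratio. In the scalar case this is the familiar integral of a reciprocal, and the two log terms then merge into one. The sketch below verifies both facts numerically with a midpoint rule; the scalar setting and all values are assumptions chosen for illustration.

```python
import math

# Illustrative scalar check (hypothetical values): 0.5 * integral of 1/(k + s)
# over s in [s_star, s_z] equals 0.5 * log((k + s_z)/(k + s_star)), and adding
# 0.5 * log((k + s_star)/(k + s_2)) gives 0.5 * log((k + s_z)/(k + s_2)).

def half_integral_of_inverse(k, lower, upper, steps=10000):
    """Midpoint-rule approximation of 0.5 * int_lower^upper ds / (k + s)."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        s = lower + (i + 0.5) * width
        total += width / (k + s)
    return 0.5 * total

def half_log_ratio(numerator, denominator):
    """0.5 * log(numerator / denominator)."""
    return 0.5 * math.log(numerator / denominator)

k, s_2, s_star, s_z = 1.0, 1.0, 2.0, 5.0   # hypothetical, with s_2 <= s_star <= s_z
lhs = half_integral_of_inverse(k, s_star, s_z) + half_log_ratio(k + s_star, k + s_2)
rhs = half_log_ratio(k + s_z, k + s_2)
print(lhs)
print(rhs)
```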



M Proof of Theorem 13



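We first establish the desired bound on $R_2$ given in (86) as follows

$$R_2 \le \min_{t=1,\dots,K_Z} I(U;\mathbf{Y}^2) - I(U;\mathbf{Z}_t) \tag{380}$$

$$= \min_{t=1,\dots,K_Z} h(\mathbf{Y}^2) - h(\mathbf{Z}_t) + \left[h(\mathbf{Z}_t|U) - h(\mathbf{Y}^2|U)\right] \tag{381}$$

$$\le \min_{t=1,\dots,K_Z} \frac{1}{2}\log\frac{|\mathbf{S}+\boldsymbol{\Sigma}^2|}{|\mathbf{S}+\boldsymbol{\Sigma}^Z_t|} + \left[h(\mathbf{Z}_t|U) - h(\mathbf{Y}^2|U)\right] \tag{382}$$

where (380) comes from Theorem 8 by noting the Markov chain $U \to \mathbf{Y}^2 \to \mathbf{Z}_t$, and (382) can be obtained by using the worst additive noise lemma, as it is done in the proof of Theorem 6. We now use Theorem 12. According to Theorem 12, for any selection of $(U, \mathbf{X})$, there exists a positive semi-definite matrix $\mathbf{K}$ such that $\mathbf{K} \preceq \mathbf{S}$ and

$$h(\mathbf{Z}_t|U) - h(\mathbf{Y}^2|U) \le \frac{1}{2}\log\frac{|\mathbf{K}+\boldsymbol{\Sigma}^Z_t|}{|\mathbf{K}+\boldsymbol{\Sigma}^2|} \tag{383}$$

$$h(\mathbf{Y}^2|U) - h(\mathbf{Y}^1_j|U) \ge \frac{1}{2}\log\frac{|\mathbf{K}+\boldsymbol{\Sigma}^2|}{|\mathbf{K}+\boldsymbol{\Sigma}^1_j|} \tag{384}$$

for any $(j,t)$ pair. Using (383) in (382) yields

$$R_2 \le \min_{t=1,\dots,K_Z} \frac{1}{2}\log\frac{|\mathbf{S}+\boldsymbol{\Sigma}^2|}{|\mathbf{K}+\boldsymbol{\Sigma}^2|} - \frac{1}{2}\log\frac{|\mathbf{S}+\boldsymbol{\Sigma}^Z_t|}{|\mathbf{K}+\boldsymbol{\Sigma}^Z_t|} \tag{385}$$

which is the desired bound on $R_2$ given in (86).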

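We now obtain the desired bound on $R_1$ given in (85) as follows

$$R_1 \le \min_{j=1,\dots,K_1} I(\mathbf{X};\mathbf{Y}^1_j|U) - I(\mathbf{X};\mathbf{Y}^2|U) \tag{386}$$

$$= \min_{j=1,\dots,K_1} h(\mathbf{Y}^1_j|U) - h(\mathbf{Y}^2|U) - \frac{1}{2}\log\frac{|\boldsymbol{\Sigma}^1_j|}{|\boldsymbol{\Sigma}^2|} \tag{387}$$

$$\le \min_{j=1,\dots,K_1} \frac{1}{2}\log\frac{|\mathbf{K}+\boldsymbol{\Sigma}^1_j|}{|\boldsymbol{\Sigma}^1_j|} - \frac{1}{2}\log\frac{|\mathbf{K}+\boldsymbol{\Sigma}^2|}{|\boldsymbol{\Sigma}^2|} \tag{388}$$

where (386) comes from Theorem 8 by noting the Markov chain $U \to \mathbf{X} \to \mathbf{Y}^1_j \to \mathbf{Y}^2$ and (388) is obtained by using (384). Since (388) is the desired bound on $R_1$ given in (85), this completes the proof.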
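The two bounds obtained in this proof are differences of log-determinant ratios and are straightforward to evaluate for a given covariance split. The sketch below computes them for 2x2 matrices using a hand-written determinant. The matrices are hypothetical diagonal examples chosen to respect the degradedness order; the helper names are assumptions and do not come from the paper.

```python
import math

# Illustrative sketch (hypothetical 2x2 covariances): evaluate the log-det
# bounds on R_1 and R_2 for a given K, S and noise covariances.

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def add2(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def half_log_det_ratio(num, den):
    """0.5 * log(det(num) / det(den))."""
    return 0.5 * math.log(det2(num) / det2(den))

def r1_bound(k, sigma1_list, sigma2):
    """min_j 0.5*log|K+S1_j|/|S1_j| - 0.5*log|K+S2|/|S2|."""
    return (min(half_log_det_ratio(add2(k, s1), s1) for s1 in sigma1_list)
            - half_log_det_ratio(add2(k, sigma2), sigma2))

def r2_bound(s, k, sigma2, sigma_z_list):
    """min_t 0.5*log|S+S2|/|K+S2| - 0.5*log|S+SZ_t|/|K+SZ_t|."""
    first = half_log_det_ratio(add2(s, sigma2), add2(k, sigma2))
    return min(first - half_log_det_ratio(add2(s, sz), add2(k, sz))
               for sz in sigma_z_list)

def scaled_identity(c):
    return [[c, 0.0], [0.0, c]]

S, K = scaled_identity(4.0), scaled_identity(1.0)      # hypothetical S and K
sigma1_list = [scaled_identity(1.0)]                   # first-group noise
sigma2 = scaled_identity(2.0)                          # second-group noise
sigma_z_list = [scaled_identity(4.0)]                  # eavesdropper noise
print(r1_bound(K, sigma1_list, sigma2))
print(r2_bound(S, K, sigma2, sigma_z_list))
```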